THERMOSTABLE POLYMERASES HAVING ALTERED FIDELITY AND
METHODS OF IDENTIFYING AND USING SAME 



This application claims the benefit of priority 
of United States Provisional Application serial No. 
60/031,496, filed November 27, 1996, the entire contents 
of which are incorporated herein by reference.

This invention was made with government support 
under grant number OIG-R35-CA-39903 awarded by the
National Institutes of Health and grant number BIR9214821 
awarded by the National Science Foundation. The 
government has certain rights in the invention. 

BACKGROUND OF THE INVENTION 

The present invention relates generally to 
thermostable polymerases and more specifically to methods 
for identifying polymerase mutants having desired 
fidelity.

Every living organism requires genetic 
material, deoxyribonucleic acid (DNA), to pass a unique
collection of characteristics to its offspring. Genes
are discrete segments of the DNA and provide the
information required to generate a new organism. Even
simple organisms, such as bacteria, contain thousands of
genes, and the number is manyfold greater in complex
organisms such as humans. Understanding the complexities 
of the development and functioning of living organisms 
requires knowledge of these genes. However, the amount 
of DNA that can be isolated for study has often been 
limiting.

A major breakthrough in the study of genes was 
the development of the polymerase chain reaction (PCR).
PCR amplifies genes or portions of genes by making many 
identical copies, allowing isolation of genes from very 
tiny amounts of DNA. The motors for PCR are DNA 
polymerases that copy the DNA of each gene during each 
round of DNA synthesis. Using oligonucleotides that 
determine the start and termination of DNA synthesis, a 
single gene can be replicated into millions of copies. 
This process has created a revolution in biotechnology 
and has been used extensively for the identification of 
mutant genes that are responsible for or associated with 
inherited human diseases. It is now possible to identify 
a mutant gene in a single cell, amplify the gene a 
million times, and establish the nature of the mutation. 
One application of identifying a mutant gene is the 
determination of genetic susceptibility to disease, which 
can be mapped by gene amplification and DNA sequencing. 
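
The scale of amplification described above can be illustrated with a minimal sketch. It assumes idealized doubling of every template in every cycle; the function name and the `efficiency` parameter are illustrative assumptions and are not taken from this disclosure.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling); real reactions fall below this value.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule and 20 ideal cycles give 2**20 copies,
# which is roughly the "million times" amplification mentioned in the text.
print(pcr_copies(1, 20))
```

Under this idealized assumption, about twenty cycles suffice to reach a million-fold amplification from a single template.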

DNA polymerases function in cells as the 
enzymes responsible for the synthesis of DNA. They 
polymerize deoxyribonucleoside triphosphates in the 
presence of a metal activator, such as Mg2+, in an order
dictated by the DNA template or polynucleotide template 
that is copied. Even though the template dictates the 
order of nucleotide subunits that are linked together in 
the newly synthesized DNA, these enzymes also function to 
maintain the accuracy of this process. The contribution 
of DNA polymerases to the fidelity of DNA synthesis is 
mediated by two mechanisms. First, the geometry of the 
substrate binding site in DNA polymerases contributes to 
the selection of the complementary deoxynucleoside 
triphosphates. Mutations within the substrate binding 
site on the polymerase can alter the fidelity of DNA 
synthesis. Second, many DNA polymerases contain a 
proofreading 3'-5' exonuclease that preferentially and
immediately excises non-complementary deoxynucleoside
triphosphates if they are added during the course of
synthesis. As a result, these enzymes copy DNA in vitro
with a fidelity varying from 5 x 10^-4 (1 error per 2000
bases) to 10^-7 (1 error per 10^7 bases) (Fry and Loeb,
Animal Cell DNA Polymerases, pp. 221, CRC Press, Inc.,
Boca Raton, FL (1986); Kunkel, T.A., J. Biol. Chem.
267:18251-18254 (1992)).
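
The relationship between an error rate and the number of bases copied per error can be checked with a short sketch. The function names and the simple linear estimate of expected errors are illustrative assumptions and are not part of the cited references.

```python
def bases_per_error(error_rate: float) -> float:
    """Average number of bases copied per misincorporation."""
    if error_rate <= 0:
        raise ValueError("error_rate must be positive")
    return 1.0 / error_rate


def expected_errors(error_rate: float, template_length: int, rounds: int = 1) -> float:
    """Rough expected number of errors accumulated in one lineage.

    Assumes errors are independent and that each round copies the full
    template once; this is a simplification for illustration only.
    """
    if template_length < 0 or rounds < 0:
        raise ValueError("template_length and rounds must be non-negative")
    return error_rate * template_length * rounds


# The two ends of the range quoted in the text.
print(round(bases_per_error(5e-4)))   # low-accuracy end of the range
print(round(bases_per_error(1e-7)))   # high-accuracy end of the range
```

With these assumptions, an error rate of 5 x 10^-4 corresponds to one error per 2000 bases, and 10^-7 to one error per 10^7 bases, in agreement with the figures quoted above.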

In vivo, DNA polymerases participate in a 
spectrum of DNA synthetic processes including DNA 
replication, DNA repair, recombination, and gene 
amplification (Kornberg and Baker, DNA Replication, pp. 
929, W.H. Freeman and Co., New York (1992)). During each 
DNA synthetic process, the DNA template is copied once or 
at most a few times to produce identical replicas. In
vitro DNA replication, in contrast, can be repeated many 
times, for example, during PCR. 

In the initial studies with PCR, the DNA 
polymerase was added at the start of each round of DNA 
replication. Subsequently, it was determined that 
thermostable DNA polymerases could be obtained from 
bacteria that grow at elevated temperatures, and these 
enzymes need to be added only once. At the elevated 
temperatures used during PCR, these enzymes would not
denature. As a result, one can carry out repetitive 
cycles of polymerase chain reactions without adding fresh 
enzymes at the start of each synthetic addition process. 
The commercial market for the sale of DNA polymerases 
from thermostable organisms can be conservatively 
estimated at 200 million dollars per year. DNA 
polymerases, particularly thermostable polymerases, are 
the key to a large number of techniques in recombinant 
DNA studies and in medical diagnosis of disease. 



Due to the importance of DNA polymerases in 
biotechnology and medicine, it would be highly 
advantageous to generate DNA polymerases having desired 
enzymatic properties such as altered fidelity. However, 
the ability to predict the effect of introducing an amino
acid mutation into the sequence of a protein remains very 
limited. Even when structural information is available 
for the protein of interest, it is often very difficult 
to predict the effect of mutations of specific amino acid 
residues on the function of that protein. In particular,
it is extremely difficult to predict amino acid 
substitutions that will alter the activity of an enzyme 
to achieve a desirable change. 

Despite the limitations in predicting the
effect of introducing amino acid substitutions into
proteins, a number of mutant DNA polymerases have been
discovered, or have been created by site-specific
mutagenesis, and have been used in PCR amplification
(Tabor and Richardson, Proc. Natl. Acad. Sci. USA
92:6339-6343 (1995)). Some of these mutant polymerases
offer particular advantages with respect to
thermostability, processivity, length of the newly
synthesized DNA product, or fidelity of DNA synthesis.
Those that are more accurate for the most part contain a
3'-5' exonuclease activity that removes misincorporated
bases prior to adding the next nucleotide during DNA
synthesis. However, the current spectrum of mutant DNA
polymerases is quite limited. For the most part, these
mutants have been obtained by introducing a single base
substitution at a specified site, purifying the enzyme
and studying the changes in catalytic activity (Joyce and
Steitz, Annu. Rev. Biochem. 63:777-822 (1994)). These
laborious and step-wise procedures have been necessary
due to the lack of adequate knowledge to predict the
effects of most single amino acid substitutions and due
to the lack of rules for predicting the effects of
multiple simultaneous substitutions.

Thus, there exists a need for rapid and 
efficient methods to produce and screen for modified 
polymerases having desired fidelity in polynucleotide 
synthesis. The present invention satisfies this need and 
provides related advantages as well. 

SUMMARY OF THE INVENTION 

The present invention provides a method for 
identifying a thermostable polymerase having altered 
fidelity. The method consists of generating a random 
population of polymerase mutants by mutating at least one 
amino acid residue of a thermostable polymerase and 
screening the population for one or more active 
polymerase mutants by genetic selection. For example, 
the invention provides a method for identifying a 
thermostable polymerase having altered fidelity by 
mutating at least one amino acid residue in an active 
site O-helix of a thermostable polymerase. The invention 
also provides thermostable polymerases and nucleic acids 
encoding thermostable polymerases having altered 
fidelity, for example, high fidelity polymerases and low 
fidelity polymerases. The invention additionally 
provides a method for identifying one or more mutations 
in a gene by amplifying the gene with a high fidelity 
polymerase. The invention further provides a method for 
accurately copying repetitive nucleotide sequences using 
a high fidelity polymerase mutant. The invention also 
provides a method for diagnosing a genetic disease using 
a high fidelity polymerase mutant. The invention further 
provides a method for randomly mutagenizing a gene by
amplifying the gene using a low fidelity polymerase
mutant.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the nucleotide and amino acid 
sequence of Taq DNA polymerase I (SEQ ID NOS: 1 and 2,
respectively).

Figure 2 shows a compilation of amino acid 
substitutions identified in a screen of Taq DNA 
polymerase I mutants. Panel A shows single mutations, 
which were identified in the screen of a 9% library, 
listed under the wild type amino acids. Panel B shows 
the sequence of multiply substituted mutants identified 
in the screen of a 9% library. Panel C shows mutations 
selected from a totally random library of selected amino 
acids.

Figure 3 shows the spectrum of single base 
changes generated in a forward mutation assay by Taq DNA 
polymerase I mutant Thr664Arg. 

DETAILED DESCRIPTION OF THE INVENTION

The invention is directed to methods for 
screening and identifying thermostable polymerases that 
have altered fidelity of DNA synthesis as well as to the 
resultant polymerase compositions. As disclosed herein, 
the invention provides rapid and efficient methods to 
identify polymerase mutants having altered fidelity. 
These methods are applicable to the identification of 
polymerase mutants having a desired activity such as high 
fidelity or low fidelity. An advantage of the methods is 
that they use a population of polymerase mutants to
rapidly identify active polymerase mutants having altered
fidelity. The identification of low fidelity mutants is 
useful for introducing mutations into specific genes due 
to the increased frequency of misincorporation of 
nucleotides during error-prone PCR amplification. The 
identification of high fidelity mutants is useful for PCR 
amplification of genes and for mapping of genetic 
mutations. The methods of the invention can therefore be 
advantageously applied to the identification of 
polymerase mutants useful for the characterization of 
specific genes and for the identification and diagnosis 
of human genetic diseases. 

As used herein, the term "polymerase" is 
intended to refer to an enzyme that polymerizes 
nucleoside triphosphates. Polymerases use a template 
nucleic acid strand to synthesize a complementary nucleic 
acid strand. The template strand and synthesized nucleic 
acid strand can independently be either DNA or RNA. 
Polymerases can include, for example, DNA polymerases 
such as Escherichia coli DNA polymerase I and Thermus 
aquaticus (Taq) DNA polymerase I, DNA-dependent RNA
polymerases and reverse transcriptases. The polymerase 
is a polypeptide or protein containing sufficient amino 
acids to carry out a desired enzymatic function of the 
polymerase. The polymerase need not contain all of the 
amino acids found in the native enzyme but only those 
which are sufficient to allow the polymerase to carry out 
a desired catalytic activity. Catalytic activities 
include, for example, 5'-3' polymerization, 5'-3'
exonuclease and 3'-5' exonuclease activities.

As used herein, the term "polymerase mutant" is 
intended to refer to a polymerase that contains one or 
more amino acids that differ from a selected polymerase.
The selected polymerase is determined based on desired
enzymatic properties and is used as a parent polymerase 
to generate a population of polymerase mutants. A 
selected polymerase can be, for example, a wild type 
polymerase as isolated from an organism or can be a 
mutant polymerase that differs from a wild type 
polymerase by one or more amino acids and has desirable 
enzymatic properties. As disclosed herein, a 
thermostable polymerase such as Taq DNA polymerase I can 
be selected, for example, as a polymerase to generate a 
population of polymerase mutants. 

As used herein, the term "population" is 
intended to refer to a group of two or more different 
molecular species. Molecular species differ by some 
detectable property such as a difference in at least one 
amino acid residue or at least one nucleotide residue or 
a difference introduced by the modification of an amino 
acid such as the addition of a chemical functional group. 
For example, a population of polymerase mutants would 
contain two or more different polymerase mutants. 
Typically, populations can be as small as two species and
as large as 10^12 species. In some embodiments,
populations are between about five and 20 different
species as well as up to hundreds or thousands of
different species. In other embodiments, populations can
be, for example, greater than 10^4, 10^5 and 10^6 different
species. In the specific example presented in Example I,
the population described therein is 50,000 different
species. In yet other embodiments, populations are
between about 10^6-10^8 or more different species. Those
skilled in the art will know a suitable size and
diversity of a population sufficient for a particular
application.



A population of polymerase mutants consists of 
two or more mutant polymerases which differ by at least 
one amino acid from the parent polymerase. A population 
of polymerase mutants can consist, for example, of 
multiple substitutions of a single amino acid residue 
where the substitutions are changes to any or all of the 
non-parental, naturally occurring amino acids at that 
amino acid position. In this example, a population
containing every such substitution would comprise
nineteen members, each having a different amino acid
substitution at that single amino acid position. A
population of polymerase mutants can also
consist, for example, of at least one substitution at two 
or more different amino acid positions. In this example, 
a minimal population containing two polymerase mutants 
would consist of a single amino acid substitution at two 
different positions. Such a population can be expanded 
with the addition of substitutions to any or all of the 
19 non-parental amino acids at these two amino acid 
positions or additional amino acid positions. 
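
The population sizes discussed above can be sketched by direct enumeration. The parent sequence `MKTAY` below is a hypothetical toy peptide, and the function names are assumptions introduced only for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 naturally occurring amino acids


def single_site_substitutions(parent: str, position: int) -> list[str]:
    """All 19 non-parental substitutions at one 0-based position."""
    wild_type = parent[position]
    return [
        parent[:position] + aa + parent[position + 1:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    ]


def max_distinct_mutants(num_positions: int) -> int:
    """Upper bound on distinct mutants when num_positions are fully randomized.

    Each position can hold any of 20 residues; the single all-parental
    combination is excluded because it is not a mutant.
    """
    if num_positions < 0:
        raise ValueError("num_positions must be non-negative")
    return 20 ** num_positions - 1


toy_parent = "MKTAY"  # hypothetical sequence, not taken from the disclosure
print(len(single_site_substitutions(toy_parent, 2)))
print(max_distinct_mutants(2))
```

A single randomized position therefore yields nineteen mutants, while two fully randomized positions allow up to 399 distinct mutants, which shows how quickly a population can be expanded.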

As used herein, the term "random" when used in 
reference to a population is intended to refer to a 
population of molecules generated without limiting the 
molecules to contain predetermined specific residues. 
Such a population excludes molecules in which a specific 
residue is substituted with a specific predetermined 
residue and individually assayed to determine its 
activity. The residues can be amino acid residues or 
nucleotide residues encoding a codon. The random
molecules can be generated, for example, by introducing 
random nucleotides into an oligonucleotide sequence that 
encodes an amino acid sequence of a protein region of 
interest (see Example I). Thus, a random population is
generated to contain random oligonucleotide sequences
which can be expressed in appropriate cells to generate a
random population of expressed proteins. A specific 
example of such a random population is the population of 
polymerase mutants described in Example I that were 
generated to screen for active polymerase mutants having 
altered fidelity. 

As used herein, the term "catalytic activity" 
or "activity" when used in reference to a polymerase is 
intended to refer to the enzymatic properties of the 
polymerase. The catalytic activity includes, for 
example: enzymatic properties such as the rate of 
synthesis of nucleic acid polymers; the Km for substrates
such as nucleoside triphosphates and template strand; the 
fidelity of template-directed incorporation of 
nucleotides, where the frequency of incorporation of 
non-complementary nucleotides is compared to that of 
complementary nucleotides; processivity, the number of
nucleotides synthesized by a polymerase prior to 
dissociation from the DNA template; discrimination of the 
ribose sugar; and stability, for example, at elevated 
temperatures. Polymerases can discriminate between 
templates, for example, DNA polymerases generally use DNA 
templates and RNA polymerases generally use RNA 
templates, whereas reverse transcriptases use both RNA 
and DNA templates. DNA polymerases also discriminate
between deoxyribonucleoside triphosphates and 
dideoxyribonucleoside triphosphates. Any of these 
distinct enzymatic properties can be included in the 
meaning of the term catalytic activity, including any 
single property, any combination of properties or all of 
the properties. Although specific embodiments 
identifying polymerase mutants having altered fidelity 
are exemplified herein, the methods of the invention can 
similarly be applied to identify polymerases having
altered catalytic activity distinct from altered
fidelity. 

As used herein, the term "fidelity" when used
in reference to a polymerase is intended to refer to the 
accuracy of template-directed incorporation of 
complementary bases in a synthesized DNA strand relative 
to the template strand. Fidelity is measured based on 
the frequency of incorporation of incorrect bases in the 
newly synthesized nucleic acid strand. The incorporation 
of incorrect bases can result in point mutations, 
insertions or deletions. Fidelity can be calculated 
according to the procedures described in Tindall and 
Kunkel (Biochemistry 27:6008-6013 (1988)). Methods for
determining fidelity are well known in the art and 
include, for example, those described in Example III. A 
polymerase or polymerase mutant can exhibit either high 
fidelity or low fidelity. As used herein, the term "high 
fidelity" is intended to mean a frequency of accurate 
base incorporation that exceeds a predetermined value. 
Similarly, the term "low fidelity" is intended to mean a 
frequency of accurate base incorporation that is lower 
than a predetermined value. The predetermined value can 
be, for example, a desired frequency of accurate base 
incorporation or the fidelity of a known polymerase. 
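
The definitions of high and low fidelity above amount to a comparison against a predetermined value, which can be sketched as follows. The function names, the returned labels and the example numbers are illustrative assumptions rather than values from this disclosure.

```python
def accuracy_frequency(correct_bases: int, total_bases: int) -> float:
    """Frequency of accurate base incorporation in a synthesized strand."""
    if total_bases <= 0 or not 0 <= correct_bases <= total_bases:
        raise ValueError("counts must satisfy 0 <= correct_bases <= total_bases, total_bases > 0")
    return correct_bases / total_bases


def classify_fidelity(frequency: float, predetermined_value: float) -> str:
    """Label a polymerase relative to a predetermined accuracy value.

    The predetermined value may be a desired accuracy or the measured
    accuracy of a known reference polymerase.
    """
    if frequency > predetermined_value:
        return "high fidelity"
    if frequency < predetermined_value:
        return "low fidelity"
    return "equal to reference"


# Hypothetical counts: 9,999 correct incorporations out of 10,000 bases,
# compared with a hypothetical reference accuracy of 0.999.
print(classify_fidelity(accuracy_frequency(9999, 10000), 0.999))
```

The same comparison applies whether the reference is a target accuracy chosen in advance or the fidelity measured for a parent polymerase.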

As used herein, the term "altered fidelity" 
refers to the fidelity of a polymerase mutant that 
differs from the fidelity of the selected parent 
polymerase from which the polymerase mutant is derived. 
The altered fidelity can either be higher or lower than 
the fidelity of the selected parent polymerase. Thus, 
polymerase mutants with altered fidelity can be 
classified as high fidelity polymerases or low fidelity 
polymerases. Altered fidelity can be determined by
assaying the parent and mutant polymerase and comparing
their activities using any assay that measures the
accuracy of template-directed incorporation of
complementary bases. Such methods for measuring fidelity 
include, for example, those described in Example III as 
well as other methods known to those skilled in the art. 

As used herein, the term "immutable" when used 
in reference to an amino acid residue is intended to 
refer to an amino acid residue which cannot be 
substituted with another amino acid residue and still 
retain measurable function of the polypeptide. An 
immutable amino acid residue can be determined by 
introducing one or more substitutions of an amino acid 
residue and assaying the resulting mutant polypeptides 
for polypeptide function. An immutable residue can be 
identified, for example, using site-directed mutagenesis 
to substitute each of the 19 non-parental amino acids at 
a given position and determining if any of these mutants 
are active. Random mutagenesis can also be employed to 
introduce substitutions of each of the nineteen, 
naturally occurring non-parental amino acids at a given 
position. Random mutagenesis can provide a statistical 
representation of all 20 amino acids at a given position. 
Sequencing of polymerase mutants allows determination of
whether a given amino acid residue can tolerate any 
mutations. Assays for determining the function of mutant 
polypeptides include in vitro enzymatic assays as well as 
genetic complementation assays such as those described in 
Example I. If substitution of an amino acid residue with 
any other amino acid results in loss of polypeptide 
function, then that amino acid residue is considered to 
be immutable. 
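
The logic for flagging candidate immutable residues from sequenced active mutants can be sketched as below. The toy sequences and function names are hypothetical, and the sketch assumes that the set of active mutants samples each position well enough for the absence of substitutions to be meaningful.

```python
def tolerated_substitutions(parent: str, active_mutants: list[str]) -> dict[int, set[str]]:
    """Map each 0-based position to the non-parental residues seen in active mutants."""
    tolerated: dict[int, set[str]] = {i: set() for i in range(len(parent))}
    for mutant in active_mutants:
        if len(mutant) != len(parent):
            raise ValueError("mutant and parent sequences must have equal length")
        for i, (wild_type, residue) in enumerate(zip(parent, mutant)):
            if residue != wild_type:
                tolerated[i].add(residue)
    return tolerated


def candidate_immutable_positions(parent: str, active_mutants: list[str]) -> list[int]:
    """Positions at which no active mutant carries a substitution."""
    table = tolerated_substitutions(parent, active_mutants)
    return [i for i, residues in table.items() if not residues]


# Hypothetical toy data: positions 0 and 4 never vary among the active mutants.
toy_parent = "MKTAY"
toy_active = ["MKTAY", "MRTAY", "MKSAY", "MKTGY"]
print(candidate_immutable_positions(toy_parent, toy_active))
```

A position flagged this way is only a candidate; as the text notes, a residue is considered immutable when substitution with any other amino acid results in loss of function.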



As used herein, the term "nearly immutable" 
when used in reference to an amino acid residue is 
intended to refer to an amino acid residue which can only 
tolerate conservative substitutions and still retain 
polypeptide function. Conservative amino acids are known 
to those skilled in the art and include those amino acids 
which have similar structure and chemical properties. 
Conservative substitutions of amino acids include, for 
example, the identification of amino acid substitutions 
based on the frequencies of amino acid changes between 
corresponding proteins of homologous organisms (Schulz 
and Schirmer, Principles of Protein Structure,
Springer-Verlag, New York (1979)).

As used herein, the term "substantially" or 
"substantially the same" when used in reference to a 
nucleotide or amino acid sequence is intended to mean 
that the function of the polypeptide encoded by the 
nucleotide or amino acid sequence is essentially the same 
as the referenced parental nucleotide or amino acid 
sequence. For example, changes in a nucleotide or amino 
acid sequence that result in substitution of amino acids
that differ from the parent molecule but that do not 
alter the desired activity of the encoded polypeptide 
would result in substantially the same sequence. A 
nucleotide or amino acid sequence is substantially the 
same if the difference in that sequence from the 
reference parental sequence does not result in any 
measurable difference in the desired activity of the 
encoded polypeptide. 

The invention provides a method for identifying 
a thermostable polymerase having altered fidelity. The 
method consists of generating a random population of 
polymerase mutants by mutating at least one amino acid 
residue of a thermostable polymerase and screening the
population for one or more active polymerase mutants by 
genetic selection. 

The generation and identification of 
polymerases having altered fidelity or altered catalytic 
activity is accomplished by first creating a population 
of mutant polymerases through random sequence mutagenesis 
of regions within the polymerase that can influence the 
fidelity of polymerization (Loeb, L.A., Adv. Pharmacol. 
35:321-347 (1996)). The identification of active mutants 
is performed in vivo and is based on genetic 
complementation of conditional polymerase mutants under 
non-permissive conditions. Once identified, the active 
polymerases are then screened for fidelity of 
polynucleotide synthesis. 

The methods of the invention employ a 
population of polymerase mutants and the screening of the 
polymerase mutant population to identify an active 
polymerase mutant. Using a population of polymerase 
mutants is advantageous in that a number of amino acid 
substitutions including single amino acid and multiple 
amino acid substitutions can be examined for their effect 
on polymerase fidelity. The use of a population of
polymerase mutants increases the probability of 
identifying a polymerase mutant having a desired 
fidelity.

Screening a population of polymerase mutants 
has the additional advantage of alleviating the need to 
make predictions about the effect of specific amino acid 
substitutions on the activity of the polymerase. The 
substitution of single amino acids has limited 
predictability as to its effect on enzymatic activity and
the effect of multiple amino acid substitutions is
virtually unpredictable. The methods of the invention
allow for screening a large number of polymerase mutants
which can include single amino acid substitutions and
multiple amino acid substitutions. In addition, using
screening methods that select for active polymerase
mutants has the additional advantage of eliminating
inactive mutants that could complicate screening
procedures that require purification of polymerase
mutants to determine activity.

Moreover, the methods of the invention allow 
for targeting of amino acid residues adjacent to 
immutable or nearly immutable amino acid residues. 
Immutable or nearly immutable amino acid residues are
residues required for activity, and those immutable
residues located in the active site provide critical
residues for polymerase activity. Mutating amino acid
residues adjacent to these required residues provides the
greatest likelihood of modulating the activity of the
polymerase. Introducing random mutations at these sites
increases the probability of identifying a mutant 
polymerase having a desired alteration in activity such 
as altered fidelity. 

A polymerase is selected as a parent polymerase
to introduce mutations for generating a library of
mutants. Polymerases obtained from thermophilic organisms
such as Thermus aquaticus have particularly desirable
enzymatic characteristics due to their stability and
activity at high temperatures. Thermostable polymerases
are stable and retain activity at temperatures greater
than about 37°C, generally greater than about 50°C, and
particularly greater than about 90°C. The use of the
thermostable polymerase Taq DNA polymerase I as a parent
polymerase to generate polymerase mutants is disclosed
herein (see Example I).

Although a specific embodiment using Taq DNA 
polymerase I is disclosed in the examples, the methods of 
the invention can similarly be applied to
thermostable polymerases other than Thermus aquaticus DNA
polymerases. Such other polymerases include, for 
example, RNA polymerases from Thermus aquaticus and RNA 
and DNA polymerases from other thermostable bacteria. 
Using the guidance provided herein in reference to DNA 
polymerases, those skilled in the art can apply the 
teachings of the invention to the generation and 
identification of these other polymerases having altered 
fidelity of polynucleotide synthesis. 

In addition to creating mutant DNA polymerases 
from organisms that grow at elevated temperatures, the 
methods of the invention can similarly be applied to non- 
thermostable polymerases provided that there is a 
selection or screen such as the genetic complementation 
of a conditional polymerase mutation as described herein 
(see Example I). Such a selection or screen of a non-
thermostable polymerase can be, for example, the
inducible or repressible expression of an endogenous
polymerase. Polymerases having altered fidelity can 
similarly be generated and selected from both prokaryotic 
and eukaryotic cells as well as viruses. Those skilled 
in the art will know how to apply the teachings described 
herein to the generation of polymerases having altered 
fidelity from such other organisms and such other cell 
types.

Thus, the invention provides a general method 
for the production of a polymerase that has an altered
fidelity in DNA or RNA synthesis. The method consists of
producing a population of sufficient size and diversity 
so as to contain at least one polymerase molecule having 
an altered fidelity and then screening that population to 
identify the polymerase having altered fidelity. The 
altered polymerase fidelity can be either an increase or 
decrease in the accuracy of DNA synthesis. 

In one embodiment, the invention involves the 
production of a relatively large population of randomly 
mutagenized nucleic acids encoding a polymerase and 
introduction of the population into host cells to produce 
a library. The mutagenized polymerase encoding nucleic 
acids are expressed, and the library is screened for 
active polymerase mutants by complementation of a 
temperature sensitive mutation of an endogenous 
polymerase. Colonies which are viable at the 
non-permissive temperature are those which have 
polymerase encoding nucleic acids which code for active 
mutants.

To generate a random population of polymerase 
mutants, a random sequence of nucleotides is substituted 
for a defined target sequence of a plasmid-encoded gene 
that specifies a biologically active molecule. In one 
application of this procedure, a double-stranded 
oligodeoxyribonucleotide is provided by hybridizing two 
partially complementary oligonucleotides, one or both of 
which contain random sequences at specified positions. 
The partially double-stranded oligonucleotide is filled 
in by DNA polymerase, cut at restriction sites and 
ligated into a DNA vector. The plasmid encodes the gene 
for a thermostable DNA polymerase, and the 
oligonucleotide is inserted in place of a portion of the 
gene that modulates the fidelity of DNA synthesis. After
ligation, the reconstructed plasmids constitute a library
of different nucleic acid sequences encoding the 
thermostable DNA polymerase and polymerase mutants. 
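
The idea of substituting random nucleotides for a defined target sequence can be simulated in silico with a minimal sketch. The per-base `doping` fraction, the toy target sequence and the function names are assumptions made for illustration; they do not reproduce the oligonucleotide design of Example I.

```python
import random

BASES = "ACGT"


def randomize_target(target: str, doping: float, rng: random.Random) -> str:
    """Return a copy of target in which each base is replaced, with
    probability `doping`, by one of the three other bases chosen uniformly."""
    if not 0.0 <= doping <= 1.0:
        raise ValueError("doping must be between 0 and 1")
    result = []
    for base in target:
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        if rng.random() < doping:
            result.append(rng.choice([b for b in BASES if b != base]))
        else:
            result.append(base)
    return "".join(result)


def build_library(target: str, size: int, doping: float, seed: int = 0) -> list[str]:
    """Generate `size` randomized variants of the target sequence."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    return [randomize_target(target, doping, rng) for _ in range(size)]


# Hypothetical 12-nucleotide target and an assumed 9% per-base doping level.
library = build_library("ATGGCCAAGACC", size=5, doping=0.09)
for variant in library:
    print(variant)
```

In such a simulation a low doping level yields mostly single substitutions, whereas a level approaching full randomization yields multiply substituted sequences, mirroring the distinction between partially and totally random libraries.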

As disclosed herein, a genetic screen can be 
used to identify active polymerase mutants having altered 
fidelity. The library of nucleic acid sequences encoding
polymerase and polymerase mutants is transfected into a
bacterial strain such as E. coli strain recA718 polA12,
which contains a temperature sensitive mutation in DNA 
polymerase. Exogenous DNA polymerases have been shown to 
functionally substitute for E. coli DNA polymerase I 
using E. coli strain recA718 polA12 and to complement the 
observed growth defect at elevated temperature, 
presumably caused by the instability of the endogenous 
DNA polymerase I at elevated temperatures (Sweasy and 
Loeb, J. Biol. Chem. 267:1407-1410 (1992); Kim and Loeb,
Proc. Natl. Acad. Sci. USA 92:684-688 (1995)). It was
unknown, however, whether a thermostable polymerase could 
substitute for E. coli DNA polymerase given the distinct 
and harsh environment experienced by thermophilic 
organisms in which enzymes must function at extremely 
high temperatures. As disclosed herein, wild type Taq 
DNA polymerase I was found to complement the growth 
defect of E. coli strain recA718 polA12 (see Example I).
Using such a complementation system, various Taq
DNA polymerase I mutants were identified in host bacteria
that harbor plasmids encoding active thermoresistant DNA
polymerases that allowed bacterial growth and colony
formation at elevated (restrictive) temperatures (see
Examples I and II).

The invention also provides a method for 
identifying a thermostable polymerase having altered 
fidelity. The method consists of generating a random 
population of polymerase mutants by mutating at least one
amino acid residue in an active site O-helix of a 
thermostable polymerase and screening the population for 
one or more active polymerase mutants. 

The invention additionally provides a method
for identifying a thermostable polymerase having altered
catalytic activity. The method consists of generating a
random population of polymerase mutants by mutating at
least one amino acid residue of a thermostable polymerase
and screening the population for one or more active
polymerase mutants.

A random population of polymerase mutants is
generated by mutating one or more amino acid residues in
an active site O-helix target sequence of a thermostable
polymerase. The O-helix has been postulated to interact
with the substrate template complex (Joyce and Steitz,
supra, (1994)). The O-helix has been observed in the
crystal structure of E. coli DNA polymerase I Klenow
fragment and Taq DNA polymerase (Beese et al., Science
260:352-355 (1993); Kim et al., Nature 376:612-616
(1995)). As disclosed in Example II, random sequences
were substituted for nucleotides encoding amino acids
Arg659 through Tyr671 of the O-helix of Taq DNA
polymerase I to generate a random population of
polymerase mutants.

Using a genetic complementation screen, a 
variety of active Taq DNA polymerase I mutants were 
identified (see Example II). Several amino acid residues 
were found to be immutable or nearly immutable based on 
the complementation assay. These immutable or nearly 
immutable amino acid residues in the O-helix are Arg659, 
Lys663, Phe667 and Tyr671. As used herein, a wild type 
amino acid is designated as a residue preceding the 
number of the amino acid position. A mutated amino acid 
is designated as a residue following the number of the 
amino acid position. These immutable or nearly immutable 
sites cannot be altered without loss of the 
function of the DNA polymerase. Due to their position in 
the active site O-helix of Taq DNA polymerase I, these 
immutable or nearly immutable residues are critical 
residues that are required for the activity of the 
polymerase. 
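
The residue notation defined above can be read mechanically. The following is a minimal sketch in Python, offered only as an illustration; the function name parse_residue_label and the set of three-letter codes are assumptions introduced here and are not part of the disclosure.

```python
import re

# Standard three-letter amino acid codes (illustrative helper, not from the source).
AMINO_ACIDS = frozenset(
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile "
    "Leu Lys Met Phe Pro Ser Thr Trp Tyr Val".split()
)

# Wild type residue, position, optional mutated residue (e.g. "Asn666Asp").
_LABEL = re.compile(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})?")


def parse_residue_label(label):
    """Split a label such as 'Arg659' or 'Asn666Asp' into its parts.

    Returns (wild_type, position, mutant); mutant is None when the label
    names only a wild type residue.
    """
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    wild_type = match.group(1)
    position = int(match.group(2))
    mutant = match.group(3)
    for residue in (wild_type, mutant):
        if residue is not None and residue not in AMINO_ACIDS:
            raise ValueError(f"unknown amino acid code: {residue!r}")
    return wild_type, position, mutant


if __name__ == "__main__":
    print(parse_residue_label("Arg659"))     # wild type residue only
    print(parse_residue_label("Asn666Asp"))  # wild type Asn replaced by Asp
```

Under this reading, a label with no trailing residue names a wild type position, and a label with a trailing residue names a substitution at that position.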

In addition to the O-helix of a polymerase, 
other regions of the polymerase can be targeted for 
random mutagenesis to generate a library of polymerase 
mutants to identify polymerase mutants having altered 
fidelity. Those skilled in the art can determine other 
regions to target for mutagenesis. Such other regions 
can be identified, for example, by sequence homology to 
other polymerases, which suggests conservation of 
function. Conserved sequences can also be used to 
identify target regions for mutagenesis based on activity 
studies of other polymerases. Protein structural models 
revealing the convergence of amino acid residues at the 
active site of a polymerase can similarly be used to 
identify target regions for mutagenesis. 

Alternatively, mutagenesis throughout the 
polymerase can be used to identify amino acid residues 
critical for polymerase function. Sequences containing 
these critical amino acid residues are target sequences 
for introducing random mutations to identify mutants 
having altered fidelity. Methods for identifying 
critical amino acid residues by introducing a small 
number of random mutations throughout a gene segment are 
well known to those skilled in the art and include, for 
example, copying by mutagenic polymerases, exposure of 
templates to DNA damaging agents prior to inserting into 
cells and replacement of regions of the DNA template with 
oligonucleotides containing sparsely populated random 
inserts. For example, a population of oligonucleotides 
with 91% correct nucleotides and 3% of each of the three 
non-complementary nucleotides at each position can be 
generated. Screening for polymerase mutants can be 
performed, for example, with the genetic complementation 
assay disclosed herein. 

The invention also provides a method for 
identifying a thermostable polymerase having altered 
fidelity. The method consists of generating a random 
population of polymerase mutants by mutating one or more 
amino acid residues adjacent to an immutable or nearly 
immutable residue in an active site O-helix of a 
thermostable polymerase and screening the population for 
one or more active polymerase mutants. 

In one embodiment, substitutions at amino acids 
adjacent to immutable or nearly immutable residues are 
used to identify polymerase mutants having altered 
fidelity. The adjacent amino acid residues can be 
immediately adjacent in the linear sequence or can be 
nearby. Adjacent residues that are nearby can be as many 
as two amino acids away from the immutable or nearly 
immutable residue in the linear sequence. A nearby 
residue can also be nearby in the three-dimensional 
structure of the polymerase and can be determined from a 
crystallographic molecular model of a polymerase. Nearby 
residues are in close enough proximity to an immutable or 
nearly immutable residue to modulate the activity of the 
polymerase. Generally, nearby residues are within two 
amino acid residues in the linear sequence from an 
immutable or nearly immutable residue or are within about 
5 Å of the immutable or nearly immutable residues, in 
particular within about 3 Å. 

Substitutions involving amino acid residues 
adjacent to immutable or nearly immutable sites have been 
found to alter the fidelity of DNA synthesis (see 
Examples IV and V). The identified immutable or nearly 
immutable amino acid residues correspond to amino acid 
residues Arg659, Lys663, Phe667 and Tyr671 of Taq DNA 
polymerase I. Thus, the invention is directed to 
altering one or more amino acid residues adjacent to an 
amino acid residue corresponding to Arg659, Lys663, 
Phe667 or Tyr671 in Taq DNA polymerase. Amino acid 
residues adjacent to these immutable residues include, 
for example, amino acids corresponding to Arg660, Ala661, 
Ala662, Thr664, Ile665, Asn666, Gly668, Val669 and Leu670 
in Taq DNA polymerase I. Corresponding residues in other 
polymerases are also included and can be identified based 
on sequence homology or based on corresponding amino 
acids in structurally similar domains as defined by a 
crystallographic molecular model. 
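
The linear-sequence definition of adjacency given above can be sketched as a short Python calculation. This is an illustrative sketch only: the function name and the choice to restrict the search to the randomized O-helix stretch (positions 659 through 671) are assumptions made here, and the three-dimensional distance criterion is not modeled.

```python
# Immutable or nearly immutable positions named in the text.
IMMUTABLE_POSITIONS = (659, 663, 667, 671)

# Randomized O-helix stretch, Arg659 through Tyr671 (inclusive).
O_HELIX_POSITIONS = range(659, 672)


def adjacent_positions(immutable=IMMUTABLE_POSITIONS, window=2,
                       region=O_HELIX_POSITIONS):
    """Return positions in `region` that lie within `window` residues of an
    immutable position in the linear sequence, excluding the immutable
    positions themselves."""
    fixed = set(immutable)
    return sorted(
        pos for pos in region
        if pos not in fixed and any(abs(pos - i) <= window for i in fixed)
    )


if __name__ == "__main__":
    print(adjacent_positions())
```

With a window of two residues, the calculation returns positions 660, 661, 662, 664, 665, 666, 668, 669 and 670, which are the adjacent positions listed above for Taq DNA polymerase I.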

The methods of the invention are also directed 
to altering residues immediately adjacent to the 
immutable or nearly immutable residues. Thus, the 
methods of the invention are directed to altering 
residues adjacent to required residues on DNA polymerases 
and identifying those mutations which have an effect on 
the fidelity of DNA synthesis. 

The invention further provides methods for 
determining a fidelity of the active polymerase mutant. 
The fidelity of active polymerase mutants can be 
determined by several methods. The active polymerases 
can be, for example, screened for altered fidelity from 
crude extracts of bacterial cells grown from the viable 
colonies. Methods for determining fidelity of synthesis 
are disclosed herein (see Example III). In one method, a 
primer extension assay is used with a biased ratio of 
nucleoside triphosphates consisting of only three of the 
nucleoside triphosphates. Elongation of the primer past 
template positions that are complementary to the deleted 
nucleoside triphosphate substrate in the reaction mixture 
results from errors in DNA synthesis. Synthesis by 
high fidelity polymerases will terminate when they 
encounter a template nucleotide complementary to the 
missing nucleoside triphosphate whereas the low fidelity 
polymerases will be more likely to misincorporate a 
non-complementary nucleotide. The accuracy of incorporation 
for the primer extension assay can be measured by 
physical criteria such as by determining the size or the 
sequence of the extension product. This method is 
particularly suitable for screening for low fidelity 
mutants since increases in chain elongation are easily 
and rapidly quantitated. 
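
The logic of the primer extension assay with one nucleoside triphosphate omitted can be sketched as follows. This Python fragment is illustrative only; the template string is hypothetical, and the template is assumed to be written in the order in which the polymerase encounters it downstream of the primer.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def stall_positions(template_downstream, missing_dntp):
    """Return 0-based offsets in the template, read in the direction of
    synthesis, at which the omitted dNTP would be required."""
    required_template_base = COMPLEMENT[missing_dntp]
    return [i for i, base in enumerate(template_downstream)
            if base == required_template_base]


def error_free_extension_length(template_downstream, missing_dntp):
    """Number of nucleotides an error-free polymerase can add before it
    reaches the first position that needs the omitted dNTP."""
    stalls = stall_positions(template_downstream, missing_dntp)
    return stalls[0] if stalls else len(template_downstream)


if __name__ == "__main__":
    template = "GCTAGGCAT"  # hypothetical template segment
    print(stall_positions(template, "A"))
    print(error_free_extension_length(template, "A"))
```

An extension product longer than the error-free length indicates that the polymerase misincorporated a nucleotide at a stall position and continued, which is the readout the assay uses to flag low fidelity mutants.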

A second method for determining the fidelity of 
polymerase mutants employs a forward mutation assay. A 
template containing a single stranded gap in a reporter 
gene such as lacZ is used for the forward mutation assay. 
Filling in of the gapped segment is carried out by crude 
heat denatured extracts of bacteria harboring plasmids 
expressing a thermostable DNA polymerase mutant. For 
determining low fidelity polymerase mutants, reactions 
are carried out in the presence of equimolar 
concentrations of each nucleoside triphosphate. For 
determining high fidelity polymerase mutants, the 
reaction is carried out with a biased pool of nucleoside 
triphosphates. Using a biased pool of nucleoside 
triphosphates results in incorporation of errors in the 
synthesized strand that are proportional to the ratio of 
non-complementary to complementary nucleoside 
triphosphates in the reaction. Therefore, the bias 
exaggerates the errors produced by the polymerases and 
facilitates the identification of high fidelity mutants. 
The fidelity of DNA synthesis is determined from the 
number of mutations produced in the reporter gene. 

Procedures other than those described above for 
identifying and characterizing the fidelity of a 
polymerase are known in the art and can be substituted 
for identifying high or low fidelity mutants. Those 
skilled in the art can determine which procedures are 
appropriate depending on the needs of a particular 
application. 

Also provided herein is an isolated 
thermostable polymerase mutant having altered fidelity. 
The polymerase mutant has one or more mutated amino acid 
residues in the active site O-helix of a thermostable 
polymerase. Additionally provided is an isolated 
thermostable polymerase mutant having altered fidelity. 
The polymerase mutant has one or more mutated amino acid 
residues adjacent to an immutable or nearly immutable 
amino acid residue in the active site O-helix of a 
thermostable polymerase. The mutated amino acid residue 
is adjacent to an amino acid residue corresponding to 
Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase. 

The invention also provides an isolated 
thermostable polymerase mutant having altered fidelity, 
where the polymerase has one or more mutated amino acid 
residues adjacent to an amino acid residue corresponding 
to Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase 
and the mutant is a high fidelity mutant. 

Using the methods of the invention, a number of 
mutants have been identified as having high fidelity of 
DNA synthesis. For example, polymerases having one or 
more single-base substitutions adjacent to Arg659, 
Lys663, Phe667, and Tyr671 in the nucleotide sequence of 
Taq DNA polymerase I have been identified. Specific 
examples of these high fidelity mutants include, for 
example, polymerases having the single substitutions 
Asn666Asp, Asn666Ile, Ile665Leu, Leu670Val, Arg660Tyr, 
Arg660Ser, Gly668Arg, Arg660Lys, Gly668Ser and Gly668Gln; 
polymerases having the double substitutions consisting of 
Thr664Ile together with Asn666Asp, and Ala661Ser together 
with Val669Leu; as well as polymerases having the triple 
substitutions consisting of Thr664Pro, Ile665Val together 
with Asn666Tyr, and Ala661Glu, Ile665Thr together with 
Phe667Leu. Additional high fidelity mutants include, for 
example, Phe667Leu and Phe667Tyr. 

The invention provides a high fidelity 
polymerase mutant having one or more amino acid 
substitutions selected from the group consisting of 
Phe667Leu; Asn666Asp; Asn666Ile; Ile665Leu; Leu670Val; 
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser; 
Gly668Gln; Thr664Ile and Asn666Asp; Ala661Ser and 
Val669Leu; Ala661Glu, Ile665Thr, and Phe667Leu; and 
Thr664Pro, Ile665Val and Asn666Tyr. The polymerase 
mutant Phe667Tyr has been previously described and is 
excluded from the compositions of the invention. 

The invention also provides an isolated 
thermostable polymerase mutant having altered fidelity, 
where the polymerase has one or more mutated amino acid 
residues adjacent to an amino acid residue corresponding 
to Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase 
and the mutant is a low fidelity mutant. The invention 
additionally provides a low fidelity polymerase mutant 
having one or more amino acid substitutions selected from 
the group consisting of Ala661Glu; Ala661Pro; Thr664Pro; 
Thr664Asn; Thr664Arg; Asn666Val; Thr664Pro and Val669Ile; 
Arg660Pro and Leu670Thr; Arg660Trp and Thr664Lys; 
Ala662Gly and Thr664Asn; Ala661Gly and Asn666Ile; 
Ala661Pro and Asn666Ile; and Ala661Ser, Ala662Gly, 
Thr664Ser and Asn666Ile. 

Low fidelity mutant DNA polymerases include 
mutations involving substitutions at Ala661, Thr664, 
Asn666, and Leu670. Specific examples of low fidelity 
mutants include, for example, polymerases having the 
single substitutions Ala661Glu, Ala661Pro, Thr664Pro, 
Thr664Asn, Thr664Arg and Asn666Val; polymerases having 
the double substitutions consisting of Thr664Pro together 
with Val669Ile, Arg660Pro together with Leu670Thr, 
Arg660Trp together with Thr664Lys, Ala662Gly together 
with Thr664Asn, Ala661Gly together with Asn666Ile, and 
Ala661Pro together with Asn666Ile; as well as polymerases 
having four substitutions consisting of Ala661Ser, 
Ala662Gly, Thr664Ser together with Asn666Ile. 

For both the high fidelity and the low fidelity 
mutations described above, the invention provides 
polymerases other than Taq DNA polymerase having 
mutations at corresponding positions. In particular, the 
invention provides thermostable polymerases other than 
Taq DNA polymerase that have mutations at corresponding 
positions and that have altered fidelity. Those skilled 
in the art can determine corresponding positions based on 
sequence homology between the polymerases. 

The invention also provides an isolated nucleic 
acid molecule encoding a polymerase mutant having high 
fidelity. The nucleic acid molecule contains a 
nucleotide sequence encoding substantially an amino acid 
sequence of Taq DNA polymerase I having one or more amino 
acid substitutions selected from the group consisting of 
Phe667Leu; Asn666Asp; Asn666Ile; Ile665Leu; Leu670Val; 
Arg660Tyr; Phe667Tyr; Arg660Ser; Gly668Arg; Arg660Lys; 
Gly668Ser; Gly668Gln; Thr664Ile and Asn666Asp; Ala661Ser 
and Val669Leu; Ala661Glu, Ile665Thr, and Phe667Leu; and 
Thr664Pro, Ile665Val and Asn666Tyr. 

Additionally provided is an isolated nucleic 
acid molecule encoding a polymerase mutant having low 
fidelity. The nucleic acid molecule contains a 
nucleotide sequence encoding substantially an amino acid 
sequence of Taq DNA polymerase I having a substitution of 
one or more amino acids selected from the group 
consisting of Ala661, Thr664, Asn666 and Leu670. The 
invention also provides a polymerase mutant having one or 
more amino acid substitutions selected from the group 
consisting of Ala661Glu; Ala661Pro; Thr664Pro; Thr664Asn; 
Thr664Arg; Asn666Val; Thr664Pro and Val669Ile; Arg660Pro 
and Leu670Thr; Arg660Trp and Thr664Lys; Ala662Gly and 
Thr664Asn; Ala661Gly and Asn666Ile; Ala661Pro and 
Asn666Ile; and Ala661Ser, Ala662Gly, Thr664Ser and 
Asn666Ile. 

The invention also provides methods for the 
identification of one or more mutations in a gene using 
the high fidelity mutant DNA polymerases of the 
invention. For example, the use of a high fidelity 
mutant to amplify a gene of interest gives greater 
confidence that the amplified sequence will more 
accurately reflect the actual sequence in the sample and 
minimizes the introduction of artifactual mutations 
during amplification of the gene. The higher accuracy of 
gene amplification provided by a high fidelity mutant 
also improves the identification of genetic mutations due 
to the increased confidence that observed mutations are 
more likely to reflect genetic mutations in the sample 
rather than artifactual mutations introduced during 
amplification. 

Additionally, the invention provides methods 
for identifying one or more mutations in a gene by 
amplifying the gene using a high fidelity polymerase 
mutant under conditions which allow polymerase chain 
reaction amplification. The gene is amplified by 
exposing the strands of the gene to repeated cycles of 
denaturing, annealing and elongation to produce an 
amplified gene product. Methods for amplifying genes 
using PCR are well known to those skilled in the art and 
include those described previously in PCR Primer: A 
Laboratory Manual, Dieffenbach and Dveksler, eds., Cold 
Spring Harbor Press, Plainview, New York (1995). The 
presence or absence of one or more mutations in the gene 
can be determined by sequencing the amplified product 
using methods well known to those skilled in the art. 

The invention provides methods for accurately 
copying repetitive nucleotide sequences by amplifying the 
repetitive nucleotide sequence using a high fidelity 
polymerase mutant. The repetitive nucleotide sequence 
can be in a gene or in a microsatellite between genes. 
The methods of amplifying the repetitive nucleotide 
sequences are carried out under conditions which allow 
PCR amplification with repeated cycles of denaturing, 
annealing and elongation as described above. 



The high fidelity mutants of the invention are 
advantageous for copying repetitive nucleotide sequences 
such as repetitive DNA because polymerases found in 
nature undergo slippage when copying DNA containing 
repetitive sequences. Therefore when polymerases found 
in nature are used, the amplification products of a 
nucleotide sequence containing a repetitive sequence do 
not accurately reflect the size or sequence of a DNA 
sequence in a sample. However, the use of a high 
fidelity polymerase mutant greatly increases the accuracy 
of an amplification product to reflect the actual size 
and sequence of the repetitive DNA sequence in the 
sample. Repetitive DNA can be found in microsatellites, 
which contain multiple repetitive nucleotide sequences 
and are dispersed throughout the genome. These 
repetitive di-, tri- and tetranucleotides are frequently, 
but not invariably, located between genes. 

The invention also provides a method for 
determining an inherited mutation by amplifying a gene 
using a high fidelity polymerase mutant. Such an 
inherited mutation can be correlated with a genetic 
disease, thereby allowing diagnosis of the genetic 
disease. The invention additionally provides methods for 
diagnosing a genetic disease by amplifying a gene using a 
high fidelity polymerase mutant. A genetic disease is 
one in which a disease is caused by a genetic mutation in 
a coding or non-coding region of DNA. Such a genetic 
mutation can be a somatic mutation or a germline 
mutation. The methods of the invention can be used to 
diagnose any genetic disease using high fidelity 
polymerase mutants. Such genetic diseases can involve 
point mutations, insertions and deletions. 



The methods of the invention employ high 
fidelity polymerase mutants and can similarly be used to 
diagnose genetic diseases involving repetitive DNA. In 
one embodiment, the genetic disease involves mutations in 
a microsatellite or repetitive DNA. Microsatellites are 
relatively stable in normal cells but are found to be 
unstable and to vary in length in some forms of 
hereditary and non-hereditary cancer, including 
hereditary nonpolyposis colorectal cancer (HNPCC), other 
cancers that arise in HNPCC families, Muir-Torre syndrome 
and small-cell lung cancer (Loeb, Cancer Res. 
54:5059-5063 (1994); Brentnall, Am. J. Pathol. 147:561-563 
(1995); Honchel et al., Semin. Cell Biol. 6:45-52 (1995); 
Eshleman and Markowitz, Curr. Opin. Oncol. 7:83-89 
(1995)). Microsatellite instability appears to be 
confined to tumors and is not present in normal tissues 
of affected individuals. 

The accuracy of amplification products of 
repetitive DNA sequences provided by the high fidelity 
mutants of the invention can be used to diagnose diseases 
involving mutations in repetitive DNA sequences. For 
example, with tumor samples, the accurate amplification 
of repetitive DNA sequences can be used to diagnose those 
cancers involving variable length in microsatellite DNA. 
Since microsatellite instability appears to be confined 
to tumors, amplification of repetitive DNA using the high 
fidelity mutants of the invention can additionally be 
applied to determining the prognosis or extent of disease 
of a cancer patient, evaluating outcomes of therapy, 
staging tumors and determining tumor status. High 
fidelity mutants of the invention can also be applied to 
amplify DNA in blood samples to identify circulating 
cells containing microsatellite instability as an 
indicator of a cancerous state. 



Other genetic diseases also involve repetitive 
DNA sequences, in particular, unstable triplet repeats. 
These unstable triplet repeat diseases involve increasing 
lengths of triplet repeat regions, ranging from ~50 
repeats in normal individuals, ~200 repeats in carriers 
to ~2000 repeats in affected individuals. Such unstable 
triplet repeat diseases include, for example, fragile X 
syndrome, spinal and bulbar muscular atrophy, myotonic 
dystrophy, Huntington's disease, spinocerebellar ataxia 
type 1, fragile XE mild mental retardation and 
dentatorubral pallidoluysian atrophy (Monckton and 
Caskey, Circulation 91:513-520 (1995)). The diagnosis of 
unstable triplet repeat diseases is particularly valuable 
since the onset of symptoms can occur later in some 
diseases and the severity of the symptoms of some 
diseases can be correlated with the size of the extended 
triplet repeat region. Thus, amplification of these 
triplet repeat regions to more accurately reflect the 
actual size of the triplet repeat in the individual 
provides more accurate diagnosis and prognosis of the 
disease. Amplification of the large expanded regions 
associated with triplet repeat diseases can be carried 
out using low fidelity polymerase mutants of the 
invention since low fidelity polymerase mutants would be 
more likely to copy through very long stretches of 
repetitive nucleotide sequences. 

One method for identifying a genetic disease 
involves utilization of primers that hybridize to 
specific genes. The primers contain 3'-terminal 
nucleotides complementary to the corresponding nucleotide 
in the mutant but not to the wild type gene. The 
primer, which is mismatched with the wild type gene, is 
annealed to the template and extended 
in the presence of a high fidelity mutant polymerase. 
The presence of an extension product is indicative of a 
mutant gene. 

The mismatch PCR method is based on the fact 
that a PCR primer that is not complementary to the 
template at the 3' end is an inefficient substrate for 
polymerases such as Taq DNA polymerase I. Wild type Taq 
DNA polymerase will occasionally misextend a mismatched 
primer, resulting in a false positive in an assay for a 
gene mutation. For example, a mutant gene with a rare TT 
mutation would be difficult to specifically amplify out 
of a pool of DNA molecules containing a wild type CC at 
the position of the TT mutant because wild type Taq DNA 
polymerase would occasionally misextend the wild type 
gene using the mismatched primer. In contrast, a high 
fidelity polymerase would not extend the mismatched 
primer. The products of a high fidelity polymerase in 
the mismatch PCR assay would therefore correspond to the 
mutant gene and would have fewer false positives than 
that observed with wild type Taq DNA polymerase. Thus, 
the more discriminating assay based on the use of high 
fidelity polymerases results in a better assay for 
detecting somatic mutations. The use of high fidelity 
mutants in such a mismatch-PCR based assay is disclosed 
herein (see Example V). 
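
The principle behind the mismatch PCR assay can be sketched as a check on whether the 3' end of a primer is base-paired with a given template. The Python fragment below is illustrative only: the primer and template sequences are hypothetical, the template is assumed to be written 3' to 5' so that it aligns base-for-base with the primer, and extension is treated as all-or-none, which describes an idealized high fidelity polymerase rather than measured behavior.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def three_prime_end_paired(primer_5to3, template_3to5, n=1):
    """Return True if the last `n` primer bases are complementary to the
    aligned template bases. `template_3to5` is written 3'->5' so that
    index i of the template faces index i of the primer."""
    if len(primer_5to3) != len(template_3to5):
        raise ValueError("primer and aligned template must be equal length")
    pairs = zip(primer_5to3[-n:], template_3to5[-n:])
    return all(COMPLEMENT[p] == t for p, t in pairs)


if __name__ == "__main__":
    primer = "GATCCGAA"           # hypothetical primer ending in AA
    mutant_template = "CTAGGCTT"  # hypothetical TT mutant site, written 3'->5'
    wild_template = "CTAGGCCC"    # hypothetical wild type CC at the same site
    print(three_prime_end_paired(primer, mutant_template, n=2))
    print(three_prime_end_paired(primer, wild_template, n=2))
```

In this idealized picture only the mutant template presents a paired 3' terminus, so only the mutant gene yields an extension product.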

The invention also provides a method for 
randomly mutagenizing a gene by amplifying the gene using 
the low fidelity polymerase mutants of the invention. 
The low fidelity polymerase mutants exhibit an efficiency 
of accurate base incorporation that is less than that of 
wild type polymerases. The efficiency of the low 
fidelity polymerase mutant is about 50% or more, 
generally 10% or more, and particularly 1% or more of 
that of a wild type polymerase. These low fidelity 
polymerase mutants would therefore exhibit between 2-fold 
and 100-fold lower fidelity than wild type polymerase. 
The introduction of mutations into specific genes using 
low fidelity polymerase mutants of the invention is 
useful for determining the effects of mutations on the 
function of those gene products. 
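
The relationship between the stated percentages and the stated fold range can be checked with a one-line calculation. The sketch below assumes, as an interpretation rather than a statement of the source, that each percentage is the accuracy of the mutant expressed relative to wild type, so that the fold reduction is simply its reciprocal; the function name is hypothetical.

```python
def fold_lower_fidelity(percent_of_wild_type):
    """Convert an accuracy expressed as a percentage of wild type into a
    fold reduction in fidelity (assumed to be the simple reciprocal)."""
    if not 0 < percent_of_wild_type <= 100:
        raise ValueError("percentage must be in the interval (0, 100]")
    return 100 / percent_of_wild_type


if __name__ == "__main__":
    for percent in (50, 10, 1):
        print(percent, fold_lower_fidelity(percent))
```

Under this assumption, 50% corresponds to 2-fold and 1% to 100-fold lower fidelity, matching the range given above.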

It is understood that modifications which do 
not substantially affect the activity of the various 
embodiments of this invention are also included within 
the definition of the invention provided herein. 
Accordingly, the following examples are intended to 
illustrate but not limit the present invention. 

EXAMPLE I 

Random Sequence Mutagenesis and Identification of Active 
Taq DNA Polymerase Mutants 

This example demonstrates random nucleotide 
sequence mutagenesis of a polymerase target sequence and 
identification of active polymerase mutants. 

Random sequence mutagenesis was used to 
introduce mutations into the O-helix of Taq DNA 
polymerase. Briefly, the Taq DNA polymerase I gene was 
obtained from the bacterial chromosome by cloning in 
pKK223-3 (Pharmacia Biotech, Piscataway, NJ). A 3.2-kb 
fragment containing the Taq DNA polymerase I gene, 
including the 5'-3' exonuclease domain and the tac 
promoter region, was further transferred into the SalI 
site of pHSG576 (pTacTaq). The Taq DNA polymerase I gene 
was sequenced to confirm wild type sequence except for 
the lack of the N-terminal three amino acids. 



A vector containing a nonfunctional insert 
within the Taq DNA polymerase I gene was constructed and 
subsequently replaced with an oligonucleotide containing 
the random sequence to avoid contamination with 
incompletely cut vectors. To generate the nonfunctional 
vector, a SacII site was produced using site-directed 
mutagenesis by changing 2070C to G using a synthetic 
oligomer, 5'-GGG TCC ACG GCC TCC CGC GGG ACG CCG AAC ATC 
CAG CTG (SEQ ID NO: 3) (SacII-2) and the single-stranded 
plasmid pFC85 (Kunkel, Proc. Natl. Acad. Sci. USA 
82:488-492 (1985)). The BstXI-NheI fragment that carries the 
SacII site was substituted for the corresponding fragment 
in pTacTaq (pTacTaqSac). A SacII-NheI fragment in 
pTacTaqSac was further replaced with the synthetic 
oligomer 5'-GGA CTG CAT ATG ACT G (SEQ ID NO: 4) (DUM-U) 
hybridized with 5'-CTA GCA GTC ATA TGC AGT CCG C 
(SEQ ID NO: 5) (DUM-D) to create the nonfunctional vector 
(Dube et al., Biochemistry 30:11760-11767 (1991)). 

Oligonucleotides containing 9% random sequence, 
in which each nucleotide indicated in parentheses was 91% 
wild type nucleotide and 3% each of the other three 
nucleotides, were synthesized by Keystone Laboratories 
(Menlo Park, CA): O+9 RANDOM is 5'-CGG GAG GCC GTG GAC 
CCC CTG ATG (CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC 
CTC TAC) GGC ATG TCG GCC CAC CG (SEQ ID NO: 6); O-O RANDOM 
is 5'-TGG CTA GCT CCT GGG AGA GGC GGT GGG CCG ACA TGC C 
(SEQ ID NO: 7). The 17 nucleotide sequences at the 3' 
ends of the two oligonucleotides are complementary. 
Equimolar amounts of these oligonucleotides (20 pmol) 
were mixed, hybridized, and extended by five cycles of 
PCR reaction (94°C for 30 sec, 57°C for 30 sec, and 72°C 
for 30 sec) in a 100 µl reaction mixture containing 10 mM 
Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.001% 
gelatin, 50 µM dNTPs, and 2.5 units of Taq DNA polymerase 
I. This PCR product (10 µl) was further amplified 25 
cycles with 20 pmol of O(+) PRIMER (5'-TTC GGC GTC CCG CGG 
GAG GCC GTG GAC CCC CT) (SEQ ID NO: 8) and 20 pmol of 
O(-) PRIMER (5'-GTA AGG GAT GGC TAG CTC CTG 
GGA) (SEQ ID NO: 9) under the same conditions. The 
amplified product was purified by phenol/chloroform 
extraction followed by ethanol precipitation and 
digestion with the restriction enzymes, SacII and NheI, 
at 37°C for 30 min in 50 mM Tris-HCl (pH 7.9), 50 mM 
NaCl, 10 mM MgCl2 and 1 mM dithiothreitol. The 
restriction fragment containing the random sequence was 
purified by phenol/chloroform extraction, ethanol 
precipitation, and filtration using a Microcon 30 filter 
(Amicon, Beverly, MA). For the totally random library, 
five oligonucleotides (80-mers), each having totally 
random sequence at one of the codons 659, 660, 663, 667 
or 668, were combined in equal amounts and hybridized to 
O-O RANDOM. After extension and digestion with 
endonucleases, the combined products were purified and 
processed as above. 
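
The design of the two RANDOM oligonucleotides (SEQ ID NO: 6 and SEQ ID NO: 7) can be checked computationally. The Python sketch below is illustrative: the sequences are transcribed from the text above, while the helper names and the small codon table (limited to the thirteen codons that occur in the parenthesized region) are assumptions introduced here.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# SEQ ID NO: 6, split into 5' flank, randomized region and 3' flank.
FLANK_5 = "CGG GAG GCC GTG GAC CCC CTG ATG"
RANDOM_REGION = "CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAC"
FLANK_3 = "GGC ATG TCG GCC CAC CG"
PLUS_OLIGO = (FLANK_5 + RANDOM_REGION + FLANK_3).replace(" ", "")

# SEQ ID NO: 7.
MINUS_OLIGO = "TGG CTA GCT CCT GGG AGA GGC GGT GGG CCG ACA TGC C".replace(" ", "")

# Only the codons that occur in the randomized region (standard genetic code).
CODONS = {
    "CGC": "Arg", "CGG": "Arg", "GCG": "Ala", "GCC": "Ala", "AAG": "Lys",
    "ACC": "Thr", "ATC": "Ile", "AAC": "Asn", "TTC": "Phe", "GGG": "Gly",
    "GTC": "Val", "CTC": "Leu", "TAC": "Tyr",
}


def overlap_is_complementary(plus, minus, n=17):
    """True if the last n bases of the two oligonucleotides can anneal."""
    return reverse_complement(plus[-n:]) == minus[-n:]


def translate_region(region, first_position=659):
    """Translate space-separated codons and number them from first_position."""
    return [f"{CODONS[codon]}{first_position + i}"
            for i, codon in enumerate(region.split())]


if __name__ == "__main__":
    print(len(PLUS_OLIGO), len(MINUS_OLIGO))
    print(overlap_is_complementary(PLUS_OLIGO, MINUS_OLIGO))
    print(translate_region(RANDOM_REGION))
```

The check confirms that the 3'-terminal 17 nucleotides of the two sequences are complementary, that the first oligonucleotide is an 80-mer, and that the parenthesized region encodes Arg659 through Tyr671.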

A random library of Taq DNA polymerase genes 
containing randomized nucleotide sequence corresponding 
to the O-helix was generated by digesting the vector 
containing the nonfunctional insert with NheI and SacII 
restriction endonucleases. The large DNA fragment was 
isolated by electrophoresis in a 0.8% agarose gel and 
purified by using GeneClean II (Bio 101, Vista, CA). This 
large fragment, lacking the nonfunctional insert, was 
ligated with an oligonucleotide containing randomized 
sequence by incubating overnight at 16°C with T4 DNA 
ligase. The ligation mixture was then used to transform 
DH5α by electroporation according to Bio-Rad (Hercules, 
CA). After electroporation, 1 ml of SOC (2% 
bactotryptone/0.5% yeast extract/10 mM NaCl/2.5 mM KCl/10 
mM MgCl2/10 mM MgSO4/20 mM glucose) was added and 
incubation continued for 1 h at 37°C. An aliquot was 
plated on 2xYT (16 g/liter tryptone, 10 g/liter yeast 
extract, 5 g/liter NaCl, pH 7.3) containing 30 µg/ml 
chloramphenicol to determine the total number of 
transformants, and the remainder was inoculated into 500 
ml of 2xYT containing 30 µg/ml chloramphenicol and 
cultured at 37°C overnight. Plasmids (random library 
vector) were purified and used for transformation of 
the recA718 polA12 strain. 

For genetic complementation to determine active 
polymerase mutants, E. coli recA718 polA12 cells (SC18-12 
E. coli B/r strain, which has the genotype recA718 polA12 
uvrA155 trpE65 lon-11 sulA1) were transformed with 
plasmids pHSG576 or pTacTaq by electroporation (Bio-Rad 
Genepulser, 2 kV, 25 µFD, 400 Ω) (Sweasy and Loeb, supra, 
(1992); Sweasy and Loeb, Proc. Natl. Acad. Sci. USA 
90:4626-4630 (1993); Witkin and Roegner-Maniscalo, J. 
Bacteriol. 174:4166-4168 (1992)). Thereafter, 1 ml of 
nutrient broth (NB) (8 g/liter) containing NaCl 
(4 g/liter) and 1 mM isopropyl β-D-thiogalactoside (IPTG) 
was added and the mixture was incubated for 1 h at 37°C. 
The transformed cells were plated on nutrient agar plates 
(containing 23 g/liter Difco nutrient agar, 5 g/liter 
NaCl, 30 µg/ml chloramphenicol, 12.5 µg/ml tetracycline 
and 1 mM IPTG) and grown at 30°C overnight. Single 
colonies were transferred to NB for growth to logarithmic 
phase at 30°C. Thereafter, ~10 µl (10^4 cells) was 
introduced at the center of an agar plate, and the 
inoculation loop was gradually moved from the center to 
the periphery as the plate was rotated. Duplicate plates 
were incubated at 30°C or 37°C for 30 h. To determine 
complementation efficiency by Taq DNA polymerase I and to 
isolate mutants, cultures of the recA718 polA12 strain 
harboring either pHSG576 or Taq DNA polymerase I were 
diluted with NB medium and plated (~500 colonies per 
plate). Duplicate plates were incubated at 30°C or 37°C, 
and visible colonies were counted after a 30 h 
incubation. Complementation was verified by a second 
round of electroporation and colony formation at the 
nonpermissive temperature. Cell-free extracts were 
prepared from selected colonies obtained at the 
restrictive temperature and assayed to confirm that they 
contained a temperature-resistant DNA polymerase activity 
(Lawyer et al., J. Biol. Chem. 264:6427-6437 (1989)). 

Wild type Taq DNA polymerase I was tested for 
its ability to complement a temperature sensitive 
polymerase contained in the E. coli strain recA718 
polA12, which is unable to grow at 37°C in rich media at 
low cell density (Witkin and Roegner-Maniscalo, 1992, 
supra). The temperature sensitive phenotype of E. coli 
strain recA718 polA12 was complemented by transformation 
with the pTacTaq plasmid encoding wild type Taq DNA 
polymerase I as indicated by growth at 37°C. Therefore, 
this E. coli strain containing a temperature sensitive 
polymerase provides a good model system for testing Taq 
DNA polymerase I mutants. 

To evaluate the involvement of different amino 
acid residues in catalysis by Taq DNA polymerase I, 
random sequences were substituted for nucleotides 
encoding a portion of the substrate binding site of Taq 
DNA polymerase I (O-helix, amino acids Arg659 through 
Tyr671). The substituted stretch was 39 nucleotides long
with 9% randomization. At each position the proportion
of the wild type nucleotide was 91% and the other 3
nucleotides were present in equal amounts (3% each).
A library of 50,000 independent mutants was
obtained. The number of colonies obtained at 37°C was 
11.8% of that obtained at 30°C. Therefore, screening a 
randomized library using E. coli strain recA718 polA12 
provided approximately 5900 colonies containing active 
Taq DNA polymerase and potential polymerase mutants. 
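
The library figures above can be checked with a short calculation. The following is a minimal sketch, assuming each of the 39 positions is randomized independently at 9%; the function and constant names are illustrative and do not come from the original work.

```python
# Sketch of the library statistics, assuming independent randomization
# at every position (an assumption, not stated in the text).
INSERT_LENGTH = 39       # randomized nucleotides (13 codons, Arg659-Tyr671)
MUTATION_RATE = 0.09     # 9% randomization: 3% for each non-wild-type base
LIBRARY_SIZE = 50_000    # independent mutants obtained
SURVIVAL_AT_37C = 0.118  # colonies at 37°C relative to 30°C


def expected_substitutions(length=INSERT_LENGTH, rate=MUTATION_RATE):
    """Mean number of nucleotide substitutions per insert."""
    return length * rate


def fraction_unmutated(length=INSERT_LENGTH, rate=MUTATION_RATE):
    """Fraction of inserts that keep the wild type nucleotide sequence."""
    return (1.0 - rate) ** length


def active_clones(library_size=LIBRARY_SIZE, survival=SURVIVAL_AT_37C):
    """Number of library members expected to grow at 37°C."""
    return library_size * survival


if __name__ == "__main__":
    print(f"substitutions per insert: {expected_substitutions():.2f}")
    print(f"unmutated fraction:       {fraction_unmutated():.3f}")
    print(f"active clones:            {active_clones():.0f}")
```

Under these assumptions an insert carries about 3.5 nucleotide substitutions on average, and 11.8% of 50,000 clones corresponds to the approximately 5900 active colonies stated above.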

These results show that a randomized library 
can be used to generate a population of polymerase 
mutants. These results also show the identification of 
active Taq DNA polymerase I mutants by screening for 
active polymerase mutants using genetic selection. 

EXAMPLE II

Identification of Taq DNA Polymerase I Mutants and
Immutable or Nearly Immutable Amino Acid Residues

This example describes the identification of Taq
DNA polymerase I mutants generated by a randomized 
library and the identification of immutable or nearly 
immutable amino acid residues. 

The active Taq DNA polymerase I mutants 
identified by the screen described in Example I were 
further characterized. The entire random nucleotide- 
containing insert was sequenced from a total of 234 
plasmids obtained at 37°C (positively selected), 16 
plasmids obtained at 30°C (nonselected) and 29 plasmids 
obtained at 30°C, which failed to grow at 37°C (negatively 
selected). All substitutions were in the randomized
nucleotides except for 12 clones.

Among the 230 positive plasmids, 168 contained
silent mutations in one or more codons. At the amino
acid level, 106 encoded the wild type residue and 124
encoded substitutions, in accord with the expected
distribution in the plasmid population. Of the 124
plasmids with amino acid changes, 40 were unique mutants
obtained just once. The remaining 84 plasmids
represented 21 different mutants. At least 79% of those
encoding the same amino acid substitutions were
independently derived since they contained different
silent mutations in other codons. In total, 61 different
amino acid sequences were obtained that complemented the
temperature-sensitive phenotype of the recA718 polA12
host.

A compilation of the amino acid substitutions 
found in Taq DNA polymerase I is shown in Figure 2. 
Solid boxes indicate the amino acid residues for which no 
substitutions were detected. Dashed boxes mark the amino 
acid positions where only conservative substitutions were 
found. The amino acid positions of Taq DNA polymerase I 
and corresponding positions of E. coli DNA polymerase I 
are indicated at the top. WT represents the wild type 
sequence and randomized amino acids are written in 
boldface type. The amino acids that have not been found 
in the DNA polymerase I family are outlined (Braithwaite 
and Ito, Nucleic Acids Res. 21:787-802 (1993)). Panel A 
shows single mutations selected from the 9% library 
listed under the wild type amino acids. Panel B shows 
the sequence of each multiply substituted mutant selected 
from the 9% library. Panel C shows mutations selected 
from the totally random library. 

The distribution of single amino acid 
substitutions among the active mutants was not random 
(see Figure 2A). For example, numerous diverse
substitutions were observed at Ala661 and Thr664. In
contrast, no substitutions were detected at five
positions (Arg659, Arg660, Lys663, Phe667 and Gly668).
This uneven distribution of replacements is unlikely to 
be the result of a bias in the nucleotide composition of 
the random insert since sequencing of both the 
nonselected and negatively selected plasmids revealed 
multiple nucleotide substitutions at each of the targeted 
positions and because silent mutations were detected at 
each of these positions in the selected clones. 

A nonrandom distribution of substitutions was 
also observed among active mutants containing multiple 
substitutions (see Figure 2B). Again, Ala661 and Thr664
were replaced with a variety of residues. However, no 
amino acid substitutions were observed in place of 
Arg659, Lys663 and Gly668, even though different silent 
nucleotide substitutions were found at each of these 
positions. A comparison of Figure 2A and B shows that 
substitutions at Arg660 and Phe667 occur only in the 
presence of substitutions at other positions. In 
addition to the mutants containing multiple substitutions 
shown in Figure 2B, two additional triple mutants were 
also found: mutant 44, with Ala661Pro, Thr664Arg, and 
Val669Leu; and mutant 54, with Ala661Thr, Thr664Pro and 
Ile665Val. 

The partially substituted library (9%) does not
provide a rigorous test of the immutability of specific
codons. Only 0.07% of sequences at each codon would be 
expected to contain nucleotide substitutions at all three 
positions. To further probe the mutability of specific 
amino acid residues, a second library was constructed 
that contained totally random substitutions at a limited 
number of designated codons. In this library,
nucleotides encoding each of the five amino acids Arg659,
Arg660, Lys663, Phe667 and Gly668 were randomized. These
were amino acid positions that did not yield single
substitutions in the 9% random library (Figure 2A).
Approximately 1300 transformants, which is 4 times more
than the number required for each possible substitution
at each of the target codons, were screened. At the
nonpermissive temperature, 113 colonies were obtained, 84
of which contained codons that encoded the wild type
amino acid sequence. Most of the amino acid
substitutions occurred in place of Arg660 or Gly668.
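
The two figures quoted in this paragraph (0.07% and the roughly 4-fold coverage) can be reproduced with the sketch below. It assumes that the three positions of a codon are substituted independently at 9%, and that the "number required" is one clone for each of the 64 possible codons at each of the 5 targeted positions; both are interpretations of the text, and the function names are illustrative.

```python
# Sketch of the codon-level arithmetic; assumptions are noted inline.

def fraction_fully_substituted_codon(rate=0.09):
    """Probability that all three nucleotides of a codon are substituted,
    assuming each position is substituted independently at `rate`."""
    return rate ** 3


def codon_variants_to_cover(n_codons=5, codons_per_position=64):
    """Variants needed to see every codon once at every target position,
    assuming one codon is randomized per oligomer (an interpretation)."""
    return n_codons * codons_per_position


def fold_coverage(n_transformants=1300):
    """Screened transformants relative to the number of variants."""
    return n_transformants / codon_variants_to_cover()


if __name__ == "__main__":
    print(f"{100 * fraction_fully_substituted_codon():.2f}% of codons")
    print(f"{codon_variants_to_cover()} variants, "
          f"{fold_coverage():.1f}-fold coverage")
```

With these assumptions, 0.09 cubed is about 0.07%, and 1300 transformants cover the 320 single-codon variants about 4 times.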

Again, Arg659 and Lys663 were completely 
conserved, with 16 and 5 silent mutations scored at these 
codons, respectively. The expected number of silent 
mutations were 21 and 4.2, respectively, assuming that 
the 5 randomized oligomers that comprised the library 
were mixed in equimolar proportions. These numbers show 
that the oligomers were roughly equally represented in 
the library and that sufficient mutants were sampled to 
conclude that Arg659 and Lys663 are immutable in these 
genetic complementation experiments (P < 0.05 for Met and 
Trp, P < 0.01 for all other substitutions). Only Tyr 
substituted for Phe at position 667 (Figure 2C), and six
silent mutations were scored for this codon. An 
additional mutant obtained with the totally randomized 
library but not shown in Figure 2 is mutant 601, with 
double substitutions Ile665Asn and Val669Ile. 

These results show that generating a random 
library and screening by genetic complementation provided 
a number of active Taq DNA polymerase I mutants. These 
results also show that amino acid residues Arg659 and 
Lys663 were found to be immutable and Phe667 and Tyr671 
were found to tolerate only conservative substitutions. 



EXAMPLE III 

Determination of the Fidelity of Active Taq DNA
Polymerase I Mutants



This example describes methods of determining 
the fidelity of active Taq DNA polymerase I mutants. Two 
types of assays are useful for determining the fidelity 
of active polymerase mutants, a primer extension assay 
and a forward mutation assay. 

Crude extracts were used to determine the 
fidelity of polymerase mutants. A single colony of
E. coli DH5α (F⁻, φ80dlacZΔM15, Δ(lacZYA-argF)U169, deoR,
recA1, endA1, phoA, hsdR17(rK⁻, mK⁺), supE44, λ⁻, thi-1,
gyrA96, relA1) carrying wild type or mutant Taq DNA
polymerase I was inoculated into 40 ml of 2xYT
(16 g/liter tryptone, 10 g/liter yeast extract, 5 g/liter
NaCl, pH 7.3) containing 30 mg/liter chloramphenicol.
After incubation at 37°C overnight with vigorous shaking,
an equal amount of fresh medium with 0.5 mM IPTG was
added, and incubation was continued for 4 h. Cells were
harvested, washed once with TE buffer (10 mM Tris-HCl,
pH 8.0, 1 mM EDTA) and suspended in 100 µl of buffer A
(50 mM Tris-HCl, pH 8.0, 2.4 mM phenylmethylsulfonyl
fluoride, 1 mM dithiothreitol, 0.5 mg/liter leupeptin,
1 mM EDTA, 250 mM KCl). Bacteria were lysed by
incubating with lysozyme (0.2 mg/ml) at 0°C for 2 h. The
lysate was centrifuged at 15,000 rpm (Sorvall, SA-600
rotor) (DuPont, Newtown, CT) for 15 min, and the
supernatant solution was incubated at 72°C for 20 min.
Insoluble material was removed by centrifugation.

Polymerases were purified as described 
previously with some modifications (Lawyer et al., PCR
Methods Appl. 2:275-287 (1993)). Briefly, a single
colony of E. coli DH5α carrying wild type or mutant Taq
DNA polymerase I was inoculated into 10 ml of 2xYT. Two
ml of the inoculum was immediately added to each of 5
bottles containing 1 liter of 2xYT with 30 mg/liter
chloramphenicol. After overnight incubation at 37°C with
vigorous shaking, 1 liter of 2xYT containing 30 mg/liter
chloramphenicol and 0.5 mM IPTG was added, and incubation
was continued for 4 h. Cells were harvested, washed once
with TE buffer and suspended in 100 ml buffer A.
Bacteria were lysed by incubating with lysozyme
(0.2 mg/ml) at 0°C for 2 h and then sonicating on ice for
45 sec by using a micro-tip probe (Sonifier, Branson
Sonic Power, Danbury, CT).

The lysate was centrifuged at 15,000 rpm 
(Sorvall, SA-600 rotor) for 15 min, and the supernatant 
solution was incubated at 72°C for 20 min. Insoluble 
material was removed by centrifugation. Ammonium sulfate
(0.2 M) and Polymin P (0.6%) were added and the
suspension was held on ice for 1 h. After removal of the
precipitate by centrifugation and filtration through a
Costar 8310 filter, the filtrate was applied to a
3 x 8-cm phenyl-SEPHAROSE HP (Pharmacia Biotech) column
equilibrated with buffer A containing 0.2 M ammonium
sulfate and 0.01% Triton X-100. The column was washed
with the same buffer (300 ml) and activity was eluted
with buffer B (TE buffer containing 0.01% Triton X-100
and 50 mM KCl). The eluate (100 ml) was dialyzed
overnight against 4 liters of buffer B and loaded onto a
0.8 x 8-cm heparin-SEPHAROSE CL6B (Pharmacia Biotech)
column equilibrated with buffer B. After washing with
buffer B (50 ml), activity was eluted in a 30 ml linear
gradient of 50-500 mM KCl in TE buffer containing 0.01%
Triton X-100. Active fractions were collected, dialyzed
against 50 mM Tris-HCl (pH 8.0) containing 50 mM KCl and
50% glycerol, and stored at -80°C.

To confirm and quantitate the presence of 
polymerase activity, crude extracts or purified enzyme 
was incubated at 72°C for 5 min in 50 mM Tris-HCl
(pH 8.0), 2 mM MgCl₂, 100 µM each dATP, dGTP, dCTP and
dTTP, 0.2 µCi of [³H]dATP and 200 µg/ml activated calf
thymus DNA. Incorporation of radioactivity into an acid-
insoluble product was measured according to Battula and
Loeb, J. Biol. Chem. 249:4086-4093 (1974). One unit
represents incorporation of 10 nmol of dNMP in 1 h,
corresponding to 0.1 unit as defined by Perkin-Elmer.

For the primer extension assay, the 14-mer
primer 5'-CGCGCCGAATTCCC (SEQ ID NO: 10) was ³²P-labeled at
the 5' end by incubation with [γ-³²P]ATP and T4
polynucleotide kinase and annealed to an equimolar amount
of the template 46-mer

5'-GCGCGGAAGCTTGGCTGCAGAATATTGCTAGCGGGAATTCGGCGCG
(SEQ ID NO: 11). Heat-inactivated E. coli extracts
containing 0.3-1 unit of wild type or mutant Taq DNA
polymerases were incubated at 45°C for 60 min in 50 mM
Tris-HCl (pH 8.0), 2 mM MgCl₂, 50 mM KCl, 20 µM each dATP,
dGTP, dCTP and dTTP and 1.4 ng of the annealed template
primer. A set of four additional reactions, each lacking
a different dNTP, was carried out for each polymerase.
Purified enzyme (1 unit) was incubated for the times
indicated under the same conditions as for crude
extracts. After electrophoresis in a 14% polyacrylamide
gel containing 8 M urea, reaction products were analyzed
by autoradiography. Extension was quantified by using an
NIH imaging program (see http://www.nih.gov/).
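
The geometry of this primer-template pair can be illustrated with a short script. This is a sketch only: it assumes ideal Watson-Crick pairing and an idealized polymerase that never misincorporates, so that synthesis halts just before the first template position requiring the missing dNTP. The function names are illustrative; the two sequences are SEQ ID NO: 10 and SEQ ID NO: 11 as given above.

```python
# Idealized model of the "minus" primer extension reactions.
PRIMER = "CGCGCCGAATTCCC"                                    # SEQ ID NO: 10
TEMPLATE = "GCGCGGAAGCTTGGCTGCAGAATATTGCTAGCGGGAATTCGGCGCG"  # SEQ ID NO: 11

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extension_product(primer=PRIMER, template=TEMPLATE):
    """Nucleotides added to the primer (5'->3') when the template is
    copied to its end; raises if the primer does not pair with the
    3' end of the template."""
    full_copy = reverse_complement(template)
    if not full_copy.startswith(primer):
        raise ValueError("primer does not anneal to the template 3' end")
    return full_copy[len(primer):]


def bases_added_before_stall(missing_dntp, primer=PRIMER, template=TEMPLATE):
    """Nucleotides an error-free polymerase adds before reaching the
    first position that needs the missing dNTP ('A', 'C', 'G' or 'T')."""
    product = extension_product(primer, template)
    position = product.find(missing_dntp)
    return len(product) if position == -1 else position


if __name__ == "__main__":
    print("added sequence:", extension_product())
    for base in "ACGT":
        print(f"minus d{base}TP: stall after",
              bases_added_before_stall(base), "nt")
```

In this model, any product longer than the predicted stall position in a minus reaction reflects misincorporation or extension of a mispaired terminus, which is the signal the assay scores.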



For the forward mutation assay, the non-coding
strand of the lacZα gene contained in 200 ng of gapped
M13mp2 DNA was copied by using 5 units of wild type or
mutant Taq DNA polymerase I in a reaction mixture
containing 50 mM Tris-HCl (pH 8.0), 2 mM MgCl₂ and 50 mM
KCl (Feig et al., Proc. Natl. Acad. Sci. USA 91:6609-6613
(1994)). For determining low fidelity polymerase
mutants, the reaction included 20 µM each dNTP. For
determining high fidelity polymerase mutants, the
reaction was carried out with biased dNTP pools
containing 0.5 mM of one dNTP and 20 mM of each of the
other three dNTPs. For example, the reaction could
contain 0.5 mM dATP and 20 mM each of dGTP, dCTP and
dTTP. After incubation at 72°C for 5 min, the DNA was
transfected into host E. coli and the plaques were scored
for white and pale blue mutant plaques (Tindall et al.,
Genetics 118:551-560 (1988)).

These results show that the fidelity of active 
Taq DNA polymerase mutants can be determined using a 
primer extension assay and a forward mutation assay. 

EXAMPLE IV 

Identification of Low Fidelity Taq DNA Polymerase I 
Mutants 

This example shows the identification of low 
fidelity Taq DNA polymerase I mutants.

The active Taq DNA polymerase I mutants 
identified in Example II were assayed by the methods 
described in Example III to identify low fidelity 
mutants. Screening for activity was carried out on 67 of
75 sequenced mutants, including all 38 with single amino
acid substitutions described in Figure 2. Plasmids
encoding the mutant polymerases were cloned, purified and
grown in E. coli, and host cells were analyzed for
expression of Taq DNA polymerase I by measuring the
activity of crude extracts. E. coli DNA polymerases and
nucleases were inactivated by heating at 72°C for 20 min.
The ability of heat-treated extracts to elongate primers 
in the absence of a complete complement of four dNTPs was 
then determined using a set of five reactions. One 
reaction contained all four complementary nucleoside 
triphosphates while each of the others lacked a different 
dNTP ("minus conditions"). Elongation in the minus
reactions is limited by the rate of misincorporation at 
template positions complementary to the missing dNTP. 

A primer extension assay was performed on wild 
type Taq DNA polymerase I and several mutants, revealing 
that several mutants had elongation patterns that 
differed from wild type Taq DNA polymerase. In the 
presence of all four dNTPs, every extract examined 
extended more than 90% of the hybridized primer to a 
product of length similar to that of the template. In 
the minus reactions, wild type Taq DNA polymerase I 
extended 48-60% of the primer up to, but not opposite, 
the first template position complementary to the missing 
dNTP. The remaining primer was terminated opposite the 
missing dNTP, presumably by incorporation of a single 
non-complementary nucleotide, or was terminated further 
downstream, presumably by extension of the mispaired 
primer terminus. A variety of elongation patterns was 
observed for the 67 mutants. Thirteen mutants extended 
more of the primer and/or synthesized a greater 
proportion of longer products than the wild type enzyme 
in three or four of the minus reactions. For example,
mutant 2 formed full-length products in reactions lacking
dGTP or dTTP. This increased extension presumably
reflects increased incorporation and/or extension of
non-complementary nucleotides. Other mutants extended
less of the primer or synthesized shorter products than
the wild type enzyme, for example, mutant 5. In several
cases, different amino acid substitutions at the same
position either increased or decreased extension in
comparable minus reactions.

A compilation of amino acid replacements in the 
13 mutants that displayed increased extension in at least 
three of the minus reactions is shown in Table I. 



Table I. Low Fidelity Mutants of Taq DNA Polymerase I
Identified in the Primer Extension Screen

     659     663     667     671
WT:  R R A A K T I N F G V L Y

Mutants: 29, 36, 40, 45, 53, 130, 156, 175, 206, 240,
247, 248, 306



With the exception of Gly668, one or more substitutions
that putatively reduce the accuracy of DNA synthesis were
observed for each of the 9 non-conserved amino acids.
Eleven mutants harbored substitutions at either Ala661 or
Thr664, including several single mutants. This initial
screen with crude extracts suggested that a large number
of changes are permitted in the O-helix that do not
reduce the ability of Taq DNA polymerase I to complement
the growth defect of recA718 polA12. Many of the
substitutions in the O-helix that do not reduce the
ability of Taq DNA polymerase I to carry out functional
complementation reduce the fidelity of DNA synthesis in
vitro.

To demonstrate that the reduction in fidelity 
exhibited by crude extracts is due to mutant Taq DNA 
polymerase I, wild type enzyme was purified as well as 
the three single mutants Ala661Glu, Ala661Pro and 
Thr664Arg. The mutant Ile665Thr, a mutant predicted to 
have no alteration in fidelity based on complementation 
assays, was also purified as a control. The mutated 
enzymes retained at least 29% of wild type activity in 
vitro, which is in accord with their ability to 
complement the growth defect caused in E. coli by 
temperature-sensitive host DNA polymerase I and ensures 
that analysis of fidelity will not be complicated by 
major impairments of catalytic efficiency. 

Primer extension assays were carried out with
the homogeneous mutant polymerases. Wild type Taq DNA
polymerase I extended most of the primer to one
nucleotide before the template position opposite the
missing complementary dNTP in a 5 min reaction. Only
about 30% of the primers were elongated further. In
reactions containing equivalent activity, the mutant
polymerases Ala661Glu, Thr664Arg and Ala661Pro extended a
larger proportion of the primers past the sites where the
wild type polymerase ceased synthesis. The control
enzyme Ile665Thr yielded an elongation pattern similar to
that of the wild type enzyme. Elongation reactions with
the three polymerases were also carried out for 60 min.
Again, Ala661Glu and Thr664Arg synthesized a greater
proportion of longer products than obtained with the wild
type and Ile665Thr polymerases. Notably, Ala661Glu,
Thr664Arg and Ala661Pro synthesized longer products in
5 min than the wild type did in 60 min.

To further analyze the reduced fidelity 
exhibited by the low fidelity polymerase mutants, a time 
course of primer elongation was carried out. Wild type 
Taq DNA polymerase I extended 9% of the primers past the 
first deoxyguanosine template residue within the 60 min 
incubation period, but elongation past the second 
deoxyguanosine was not detected. In the same interval, 
Thr664Arg extended 93% of the primer past the first 
template deoxyguanosine, and elongation proceeded past as
many as five template deoxyguanosines. Importantly, a
comparable proportion of primers was extended at all time
points, despite the striking difference in the length of
the products. These time course data indicate that
greater elongation reflects increased ability to utilize
non-complementary substrates and primer termini, rather
than a putative difference in the amount of activity
present.

In a forward mutation assay, the fidelity of
DNA synthesis by the purified polymerases was quantitated
by measuring the frequency of mutations produced by
copying a biologically active template in vitro (Kunkel
and Loeb, J. Biol. Chem. 254:5718-5725 (1979)). The
target sequence was the lacZα gene located within a
single-stranded region in gapped circular double-stranded
M13mp2 DNA (Feig and Loeb, Biochemistry 32:4466-4473
(1993)). The gapped segment was filled by synthesis with
the wild type or mutant enzymes. The double-stranded
circular product was transfected into E. coli, and the
mutation frequency was determined by scoring white and
pale blue mutant plaques. A comparison of the specific
activities and mutation frequencies of the purified
enzymes is presented in Table II. After synthesis by
wild type Taq DNA polymerase I, the mutation frequency
was not greater than that of the uncopied control.
Synthesis by Ala661Glu and Thr664Arg gave rise to
mutation frequencies more than 7- and 25-fold greater,
respectively, than that of the wild type polymerase.

Table II. Mutation Frequency in the lacZα Forward
Mutation Assay

Taq Pol I   Specific     Plaques Scored     Mutation
            Activity     Total    Mutant    Frequency
            (units/mg)                      (x 10⁻³)

WT          66,000       8,637    22        2.5
A661E       45,000       6,782    116       17.1
T664R       23,000       5,148    324       62.9
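
The mutation frequencies in Table II follow directly from the plaque counts. The sketch below reproduces them, assuming the frequency is simply mutant plaques divided by total plaques with no background correction; the dictionary and function names are illustrative.

```python
# Plaque counts from Table II: enzyme -> (total plaques, mutant plaques).
TABLE_II = {
    "WT": (8637, 22),
    "A661E": (6782, 116),
    "T664R": (5148, 324),
}


def mutation_frequency(total, mutant):
    """Fraction of scored plaques with a mutant (white or pale blue)
    phenotype, with no background subtraction (an assumption)."""
    return mutant / total


def frequency_per_thousand(total, mutant):
    """Mutation frequency in units of 10^-3, rounded as in Table II."""
    return round(1000 * mutation_frequency(total, mutant), 1)


if __name__ == "__main__":
    for enzyme, (total, mutant) in TABLE_II.items():
        print(f"{enzyme:6s} {frequency_per_thousand(total, mutant)}")
```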



A sample of independent, randomly chosen
mutants produced by Thr664Arg was characterized by DNA
sequence analysis using a THERMO SEQUENASE cycle
sequencing kit (Amersham Life Science, Cleveland, OH).
Both base substitutions and frameshifts were found
throughout the targeted lacZα gene and its regulatory
sequence. Of the 64 independent plaques, 57 had
mutations in the target. Other mutations presumably
occurred outside the target region. Some had more than
one base substitution and a total of 66 mutations were
observed (see Figure 3). Among them, 61 were base
substitutions. Transitions (38/61) were more frequent
than transversions (23/61). T → C transitions accounted
for 31 of 61 base substitutions, while T → A (9/61), A →
T (8/61) and G → A (5/61) substitutions were less
frequent. This base substitution spectrum is essentially
the same as that reported for wild type Taq DNA
polymerase I (Tindall and Kunkel, supra, 1988). From
these data, the base substitution fidelity of Thr664Arg
can be calculated as 8.6 x 10⁻⁴ or 1 error per 1200
nucleotides. On the basis of the five frameshift mutants
detected, the frameshift error can be calculated as 4.9 x
10⁻⁵ or 1 error per 20,000 nucleotides.
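
The "1 error per N nucleotides" figures are the reciprocals of the stated error rates, and the mutation tallies can be cross-checked in the same way. A minimal sketch follows, using only numbers quoted in this paragraph; the names are illustrative.

```python
# Error rates and mutation tallies quoted for Thr664Arg.
BASE_SUBSTITUTION_RATE = 8.6e-4  # errors per nucleotide
FRAMESHIFT_RATE = 4.9e-5         # errors per nucleotide

TRANSITIONS = 38
TRANSVERSIONS = 23
FRAMESHIFTS = 5


def nucleotides_per_error(rate):
    """Average number of nucleotides synthesized per error."""
    return 1.0 / rate


def total_mutations():
    """Base substitutions plus frameshifts."""
    return TRANSITIONS + TRANSVERSIONS + FRAMESHIFTS


if __name__ == "__main__":
    print(f"{nucleotides_per_error(BASE_SUBSTITUTION_RATE):.0f} nt "
          "per base substitution")
    print(f"{nucleotides_per_error(FRAMESHIFT_RATE):.0f} nt per frameshift")
    print(f"{total_mutations()} mutations in total")
```

The reciprocals come to roughly 1160 and 20,400 nucleotides, consistent with the rounded values of 1200 and 20,000 given above, and the tallies sum to the 66 mutations observed.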

These results show that low fidelity Taq DNA 
polymerase I mutants were identified from a randomized 
library using a genetic complementation screen. The 
fidelity of Taq DNA polymerase I mutants was determined 
by primer extension assays and forward mutation assays. 

EXAMPLE V 

Identification of High Fidelity Taq DNA Polymerase I
Mutants 

This example shows the identification of high 
fidelity Taq DNA polymerase I mutants. 

The active Taq DNA polymerase I mutants 
identified in Example II were assayed by the methods 
described in Example III to identify high fidelity 
mutants. A panel of 75 active polymerases was screened. 
Candidate high fidelity polymerase mutants are shown in 
Table III. 



Table III. Candidate High Fidelity Mutants of
Taq DNA Polymerase I

659 663 667 671

WT: RRAAKTINFGVLY

FL : L
74 : E T L
146 : D
147 : I
149 : ID
169 : S L
186 : L
219 : P V Y
254 : V
407 : Y
424 : Y
426 : S
487 : R
488 : K
530 : S
614 : Q



Thirteen of the active polymerases exhibited greater 
accuracy in DNA synthesis. Table IV summarizes the 
results of a forward mutation assay of some of these high 
fidelity mutants. Several polymerase mutants displayed 
higher fidelity than the wild type Taq DNA polymerase. 
Polymerase mutants exhibiting particularly high fidelity 
are mutant 424, with Phe667Tyr, mutant 426, with 
Arg660Ser and mutant 488, with Arg660Lys.

Table IV. Fidelity of Taq DNA Polymerase Mutants in a
lacZ Forward Mutation Assay

Enzyme        Total      Mutant     Mutation
              Plaques    Plaques    Frequency
                                    (x 10⁻³)

Wild Type     5680       49         8.6

High Fidelity Mutants
MS147         7249       47         6.5
MS169         7275       34         5.1
MS254         6898       40         5.8
MS424         4810       14         2.7
MS426         5727       23         4.1
MS488         3442       13         1.5

Low Fidelity Mutant
MS206         3333       133        40



These results show that Taq DNA polymerase 
mutants were identified and found to exhibit higher 
fidelity than wild type Taq DNA polymerase. 

EXAMPLE VI 

High Fidelity Taq DNA Polymerase Mutants Enhance the
Sensitivity of Mismatch PCR-based Assays for Somatic
Mutations

This example shows the use of high fidelity 
mutants obtained by mutating the active site O-helix of 
Taq DNA polymerase I to enhance the sensitivity of 
mismatch PCR-based assays for somatic mutations.

Mismatch PCR is the basis of allele-specific
identification of inherited mutations within genes and
somatic mutations that occur in tumors. In these
studies, one compares the extension of a correctly-
matched primer with the lack of extension using a primer
with a 3'-terminal mismatch. The rate of extension by
DNA polymerase using a primer with a single mismatch
compared to a primer with a 3'-complementary base pair
(matched) terminus is approximately 10⁻⁵ (Perrino and
Loeb, J. Biol. Chem. 264:2898-2905 (1989)). Elongation
from a double mismatch is even less frequent, and thus
offers an even more stringent test of the inability of
mutant Taq DNA polymerases to elongate a mismatched
primer terminus.

A template containing the wild type sequence of 
human DNA polymerase-β at nucleotide positions 886-889
(CCCCTGGG) was utilized. PCR reactions were carried out
with two complementary primers that flank the sequence
(matched) or with one matched primer and a second,
terminally mismatched primer with AA at the 3' terminal
position. The AA would be across from the CC
(underlined) in the template strand. In these studies,
the ratio of templates containing the complementary and
non-complementary sequences was varied. The PCR
amplified product was separated by polyacrylamide gel
electrophoresis and quantitated by phosphoimage analysis.
Wild type Taq DNA polymerase detected one molecule of
template containing a TT substitution in place of the two
template CC when present in a population of 10⁵ molecules
of the non-mutant template containing CC. In
contrast, both of the high fidelity Taq DNA polymerase
mutants, with substitutions Phe667Tyr and Arg659Ser,
detected one molecule of the TT template amongst 10⁸
molecules of the CC template when the primer contained
two terminal 3'-AA nucleotide residues.



These results show that high fidelity Taq DNA 
polymerase mutants have two to three orders of magnitude 
enhanced sensitivity for detecting mutant DNA using a 
mismatch PCR-based assay. 
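
The gain in sensitivity can be expressed as the base-10 logarithm of the ratio of the two detection limits. The sketch below uses the figures reported above (one mutant template in 10⁵ for the wild type enzyme, one in 10⁸ for the high fidelity mutants); the function name is illustrative.

```python
import math


def orders_of_magnitude_gain(wild_type_background, mutant_background):
    """log10 of the ratio between the non-mutant template backgrounds
    in which a single mutant template is still detected."""
    return math.log10(mutant_background / wild_type_background)


if __name__ == "__main__":
    gain = orders_of_magnitude_gain(10**5, 10**8)
    print(f"sensitivity gain: about {gain:.0f} orders of magnitude")
```

For the reported detection limits the gain is three orders of magnitude, the upper end of the range stated above.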

EXAMPLE VII 

High Fidelity Taq DNA Polymerase Mutants Enhance 
Sensitivity of Detection of Repetitive DNA Sequences 

This example demonstrates the use of high 
fidelity polymerase mutants to enhance the sensitivity 
and accuracy of amplifying repetitive DNA sequences. 

Detection of the length of unstable 
microsatellite DNA in certain human tumors has depended 
on PCR amplification of specific sequences and 
determination of changes in electrophoretic mobility in 
gels. Due to the slippage of DNA polymerase while 
copying repetitive DNA, the interpretation of the results 
of this method has remained unsatisfactory.

High fidelity Taq DNA polymerases are 
identified using the methods described in Examples I and 
III. DNA templates containing runs of CA repeats with 
the number of repeats varying from 5 to 50 are used to 
test high fidelity Taq DNA polymerase mutants. After 20 
to 70 rounds of PCR amplification, the product of the 
reaction is displayed on polyacrylamide gels. High 
fidelity polymerase mutants which display fewer slippage
errors copying the repetitive sequences are identified.
These high fidelity polymerase mutants are used to
amplify repetitive DNA sequences in samples, for example
tissue or tumor samples.



These results show that high fidelity mutants 
having enhanced sensitivity and accuracy in amplifying 
repetitive DNA sequences can be identified and used to
amplify repetitive DNA in tissue or tumor samples. 

Throughout this application various 
publications have been referenced. The disclosures of 
these publications in their entireties are hereby 
incorporated by reference in this application in order to
more fully describe the state of the art to which this 
invention pertains. 

Although the invention has been described with 
reference to the disclosed embodiments, those skilled in 
the art will readily appreciate that the specific
experiments detailed are only illustrative of the 
invention. It should be understood that various 
modifications can be made without departing from the 
spirit of the invention. 



